Preface


Since the early 1970s, when recombinant DNA technology was first
developed, there has been a veritable explosion of knowledge in the 
biological sciences. Since that time, with the advent of PCR, chemical 
DNA synthesis, DNA sequencing, monoclonal antibodies, directed mutagenesis,
genomics, proteomics, and metabolomics, our understanding of
and ability to manipulate the biological world have grown exponentially. 
When the first edition of Molecular Biotechnology: Principles and Applications 
of Recombinant DNA was published in 1994, nearly all of the transgenic 
organisms that were produced included only a single introduced gene. Just 
15 years later, it is not uncommon for researchers to engineer organisms by 
modifying both the activity and the regulation of existing genes while at 
the same time introducing entire new pathways. In 1994, only a handful of 
products produced by this new technology were available in the marketplace.
Today, molecular biotechnology has given us several hundred new
therapeutic agents, with many more in the pipeline, as well as dozens of 
transgenic plants. The use of DNA has become a cornerstone of modern 
forensics, paternity testing, and ancestry determination. Several new 
recombinant vaccines have been developed, with many more on the 
horizon. The list goes on and on. Molecular biotechnology really has lived 
up to its promise, to all of the original hype. It has been estimated that 
worldwide there are currently several thousand biotechnology companies 
employing tens of thousands of scientists. When the exciting science being 
done at universities, government labs, and research institutes around the 
world is factored in, the rate of change and of discovery in the biological 
sciences is astounding. This fourth edition of Molecular Biotechnology, 
building upon the fundamentals that were established in the previous three 
editions, endeavors to provide readers with a window on some of the 
major developments in this growing field in the past several years. Of 
necessity, we have had to be highly selective in the material that is included 
in this edition. Moreover, the window that we are looking through is 
moving. This notwithstanding, we both expect and look forward to the 
commercialization of many of these discoveries as well as to the development
of new approaches, insights, and discoveries.

Bernard R. Glick 
Jack J. Pasternak 
Cheryl L. Patten 
Preface to the First Edition


Molecular biotechnology emerged as a new research field from the
fusion, in the late 1970s, of recombinant DNA technology and
traditional industrial microbiology. Whether one goes to the movies to see
Jurassic Park with its ingenious but scientifically untenable plot of
cloning dinosaurs, reads in the newspaper about the commercialization of a
new "biotech" tomato that has an extended shelf life, or hears one of the
critics of molecular biotechnology talking about the possibility of dire
consequences from genetic engineering, there is a significant public
awareness about recombinant DNA technology. In this book, we introduce and
explain what molecular biotechnology actually is as a scientific
discipline, how the research in the area is conducted, and how this
technology may realistically affect our lives in the future.

We have written Molecular Biotechnology: Principles and Applications of 
Recombinant DNA to serve as a text for courses in biotechnology, recombinant
DNA technology, and genetic engineering or for any course introducing
both the principles and the applications of contemporary molecular
biology methods. The book is based on the biotechnology course we have
offered for the past 12 years to advanced undergraduate and graduate students
from the biological and engineering sciences at the University of
Waterloo. We have written this text for students who have an understanding
of basic ideas from biochemistry, molecular genetics, and microbiology.
We are aware that it is unlikely that students will have had all of
these courses before taking a course on biotechnology. Thus, we have tried 
to develop the topics in this text by explaining their broader biological 
context before delving into molecular details. 

This text emphasizes how recombinant DNA technology can be used 
to create various useful products. We have, wherever possible, used
experimental results and actual methodological strategies to illustrate
basic concepts, and we have tried to capture the flavor and feel of how molecular
biotechnology operates as a scientific venture. The examples that we have 
selected—from a vast and rapidly growing literature—were chosen as case 
studies that not only illustrate particular points but also provide the reader 
with a solid basis for understanding current research in specialized areas of 
molecular biotechnology. Nevertheless, we expect that some of our examples
will be out of date by the time the book is published, because molecular
biotechnology is such a rapidly changing discipline.




For the ease of the day-to-day practitioners, scientific disciplines often 
develop specialized terms and nomenclature. We have tried to minimize 
the use of technical jargon and, in many instances, have deliberately used 
a simple phrase to describe a phenomenon or process that might otherwise 
have been expressed more succinctly with technical jargon. In any field of 
study, synonymous terms that describe the same phenomenon exist. In 
molecular biotechnology, for example, recombinant DNA technology, gene 
cloning, and genetic engineering, in a broad sense, have the same meaning. 
When an important term or concept appears for the first time in this text, it 
is followed in parentheses by a synonym or equivalent expression. An
extensive glossary can be found at the end of the book to help the reader 
with the terminology of molecular biotechnology. 

Each chapter opens with an outline of topics and concludes with a 
detailed summary and list of review questions to sharpen students' critical 
thinking skills. All of the key ideas in the book are carefully illustrated by 
the more than 200 full-color diagrams in the pedagogical belief that a
picture is indeed worth a thousand words. Chapter 1 introduces molecular
biotechnology as a scientific and economic venture, and the next five
chapters (2 to 6) deal with the methodologies of molecular biotechnology.
The chapters of Part I act as a stepping-stone for the remainder of the book.
Chapters 7 to 12 in Part II present examples of microbial molecular
biotechnology covering such topics as the production of metabolites, vaccines,
therapeutics, diagnostics, bioremediation, biomass utilization, bacterial
fertilizers, and microbial pesticides. Chapter 13 describes some of the key
components of large-scale fermentation processes using genetically
engineered (recombinant) microorganisms. In Part III, we deal with the
molecular biotechnology of plants and animals (Chapters 14 and 15).
Chapters 16 and 17 present the isolation of human disease-causing genes by
recombinant DNA technology and the ways in which genetic manipulation,
although still in its early stages, is being contemplated for the treatment
of human diseases. The book concludes with coverage of the
regulation of molecular biotechnology and patents in Part IV. 

A brief mention should be made about the reference sections that 
follow each chapter. Within many of the chapters we have relied upon the 
published work of various researchers. In all cases, although not cited 
directly in the body of a chapter, the original published articles are noted in 
the reference section of the appropriate chapter. In some cases, we have 
taken "pedagogic license" and either extracted or reformulated data from 
the original publications. Clearly, we are responsible for any distortions or 
misrepresentations from these simplifications, although we hope that none 
has occurred. The reference sections also contain other sources that we 
used in a general way, which might, if consulted, bring the readers closer 
to a particular subject. 
Molecular biotechnology is an exciting scientific discipline that is
based on the ability of researchers to transfer specific units of
genetic information from one organism to another. This conveyance
of a gene or genes relies on the techniques of genetic engineering
(recombinant DNA technology). The objective of recombinant DNA technology
is often to create a useful product or a commercial process. In part I,
the concept of molecular biotechnology, some fundamentals of molecular
biology, and recombinant DNA procedures are presented. Essential molecular
biotechnology laboratory techniques, including chemical synthesis of
genes, the polymerase chain reaction (PCR), and DNA sequencing, are
discussed. Developments in sequencing technologies have led to the sequencing
of the entire genomes of many organisms, and this has enabled researchers 
to begin to understand organisms from their sequences and to identify 
novel genes with potentially useful functions. In addition to isolation 
(cloning) of genes, it is important that these genes function properly in a 
host organism. To this end, strategies for optimizing the expression of a 
cloned gene in either prokaryotic or eukaryotic cells are reviewed. Finally, 
procedures for modifying cloned genes by the introduction of specific 
nucleotide changes (in vitro mutagenesis) to enhance the properties of the 
target proteins are examined. Together, the chapters in part I provide the 
conceptual and technical underpinnings for understanding the applications 
of molecular biotechnology that are described in the ensuing chapters. 
The Development of
Molecular Biotechnology 


The Emergence of Molecular Biotechnology 

Long before we knew that microorganisms existed or that genes were
the units of inheritance, humans looked to the natural world to
develop methods to increase food production, preserve food, and
heal the sick. Our ancestors discovered that grains could be preserved
through fermentation into beer; that storing horse saddles in a warm, damp
corner of the stable resulted in the growth of a saddle mold that could heal
infected saddle sores; and that intentional exposure to a "contagion" could
somehow provide protection from an infectious disease on subsequent
exposure. Since the discovery of the microscopic world in the 17th century,
microorganisms have been employed in the development of numerous
useful processes and products. Many of these are found in our households
and backyards. Lactic acid bacteria are used to prepare yogurt and
probiotics, insecticide-producing bacteria are sprayed on many of the plants
from which the vegetables in our refrigerator were harvested,
nitrogen-fixing bacteria are added to the soil used for cultivation of
legumes, the enzymatic stain removers in laundry detergent came from a
microorganism, and antibiotics derived from common soil microbes are used to
treat infectious diseases. These are just a few examples of traditional
biotechnologies that have improved our lives. Up to the early 1970s, however,
traditional biotechnology was not a well-recognized scientific discipline,
and research in this area was centered in departments of chemical
engineering and occasionally in specialized microbiology programs.

In a broad sense, biotechnology is concerned with the production of
commercial products generated by the metabolic action of microorganisms.
More formally, biotechnology may be defined as "the application of
scientific and engineering principles to the processing of material by
biological agents to provide goods and services." The term "biotechnology"
was first used in 1917 by a Hungarian engineer, Karl Ereky, to describe an
integrated process for the large-scale production of pigs by using sugar
beets as the source of food. According to Ereky, biotechnology was "all
lines of work by which products are produced from raw materials with the
aid of living things." This fairly precise definition was more or less
ignored. For a number of years, the term biotechnology was used to describe
two very different engineering disciplines. On one hand, it referred to
industrial fermentation. On the other, it was used for the study of
efficiency in the workplace, which is now called ergonomics. This ambiguity
ended in 1961 when the Swedish microbiologist Carl Göran Hedén recommended
that the title of a scientific journal dedicated to publishing research in
the fields of applied microbiology and industrial fermentation be changed
from the Journal of Microbiological and Biochemical Engineering and
Technology to Biotechnology and Bioengineering. From that time on,
biotechnology has clearly and irrevocably been associated with the study of
"the industrial production of goods and services by processes using
biological organisms, systems, and processes," and it has been firmly
grounded in expertise in microbiology, biochemistry, and chemical
engineering.

FIGURE 1.1 Principal steps of a bioengineered biotechnology process.
Parenthetically, Karl Ereky's scheme entailed using inexpensive sugar beets
(raw material) to feed pigs (biotransformation) for the production of pork
(downstream processing).

An industrial biotechnology process that uses microorganisms for producing
a commercial product typically has three key stages (Fig. 1.1):

1. Upstream processing: preparation of the microorganism and the
raw materials required for the microorganism to grow and produce
the desired product

2. Fermentation and transformation: growth (fermentation) of the 
target microorganism in a large bioreactor (usually >100 liters) 
with the consequent production (biotransformation) of a desired 
compound, which can be, for example, an antibiotic, an amino 
acid, or a protein 

3. Downstream processing: purification of the desired compound 
from either the cell medium or the cell mass 

Biotechnology research is dedicated to maximizing the overall efficiency
of each of these steps and to finding microorganisms that make
products that are useful in the preparation of foods, food supplements, and
drugs. During the 1960s and 1970s, this research focused on upstream
processing, bioreactor design, and downstream processing. These studies led
to enhanced bioinstrumentation for monitoring and controlling the
fermentation process and to efficient large-scale growth facilities that increased
the yields of various products. 

The biotransformation component of the overall process was the most 
difficult phase to manipulate. Commodity production by naturally occurring
microbial strains on a large scale was often considerably less than
optimal. Initial efforts to enhance product yields focused on creating vari¬ 
ants (mutants) by using chemical mutagens or ultraviolet radiation to 
induce changes in the genetic constitution of existing strains. However, the 
level of improvement that could be achieved in this way was usually limited
biologically. If a mutated strain, for example, synthesized too much of
a compound, other metabolic functions often were impaired, thereby 
causing the strain's growth during large-scale fermentation to be less than 
desired. Despite this constraint, the traditional "induced mutagenesis and 
selection" strategies of strain improvement were extremely successful for a 
number of processes, such as the production of antibiotics. 

The traditional genetic improvement regimens were tedious, time-consuming,
and costly because of the large numbers of colonies that had to
be selected, screened, and tested. Moreover, the best result that could be
expected with this approach was the improvement of an existing inherited
property of a strain rather than the expansion of its genetic capabilities.
Despite these limitations, by the late 1970s, effective processes for the mass
production of a wide range of commercial products had been perfected.

Today, we have acquired sufficient knowledge of the biochemistry,
genetics, and molecular biology of microorganisms to accelerate the
development of useful and improved biological products and processes and to
create new products that would not otherwise occur. Distinct from
traditional biotechnology, the modern methods require knowledge of and
manipulation of genes, the functional units of inheritance. The discipline
that is concerned with the manipulation of genes for the purpose of
producing useful goods and services using living organisms is known as
molecular biotechnology. The pivotal development that enabled this
technology was the establishment of techniques to isolate genes and to transfer
them from one organism to another. This technology is known as recombinant
deoxyribonucleic acid (DNA) technology, and it began as a lunchtime
conversation between two scientists working in different fields who met at
a scientific conference in 1973. In his laboratory at Stanford University in
California, Stanley Cohen had been developing methods to transfer plasmids,
small circular DNA molecules, into bacterial cells. Meanwhile, Herbert
Boyer of the University of California at San Francisco was working with
enzymes that cut DNA at specific nucleotide sequences. Over lunch at the
meeting, they reasoned that Boyer's enzyme could be used to
splice a specific segment of DNA into a plasmid and then the recombinant
plasmid could be introduced into a host bacterium using Cohen's method.


Recombinant DNA Technology 

It was clear to Cohen and Boyer and others that recombinant DNA technology
had far-reaching possibilities. As Cohen noted at the time, "It may be
possible to introduce in E. coli, genes specifying metabolic or synthetic
functions such as photosynthesis or antibiotic production indigenous to other
biological classes." The first commercial product produced using
recombinant DNA technology was human insulin, which is used in the treatment of
diabetes. The DNA sequence that encodes human insulin was synthesized,
a remarkable feat in itself at the time, and was transplanted into a plasmid
that could be maintained in the common bacterium Escherichia coli. The
bacterial host cells acted as biological factories for the production of the two
peptide chains of human insulin, which, after being combined, could be 
purified and used to treat diabetics who were allergic to the commercially 
available porcine (pig) insulin. In the previous decade, this achievement 
would have seemed absolutely impossible. By today's standards, however, 
this type of genetic engineering is considered commonplace. 

The nature of biotechnology was changed forever by the development 
of recombinant DNA technology. With these techniques, the maximization 
of the biotransformation phase of a biotechnology process was achieved 
more directly. Genetic engineering provided the means to create, rather 
than merely isolate, highly productive strains. Not long after the production 
of the first commercial preparation of recombinant human insulin, bacteria 
and then eukaryotic cells were used for the production of insulin,
interferon, growth hormone, viral antigens, and a variety of other therapeutic
proteins. Recombinant DNA technology could also be used to facilitate the
biological production of large amounts of useful low-molecular-weight
compounds and macromolecules that occur naturally in minuscule quantities.
Plants and animals became targets to act as natural bioreactors for
producing new or altered gene products that could never have been created
either by mutagenesis and selection or by crossbreeding. Molecular
biotechnology has become the standard method for developing living systems
with novel functions and capabilities for the synthesis of important
commercial products. 

Most new scientific disciplines do not arise entirely on their own. They 
are often formed by the amalgamation of knowledge from different areas 
of research. For molecular biotechnology, the biotechnology component 
was perfected by industrial microbiologists and chemical engineers, 
whereas the recombinant DNA technology portion owes much to discoveries
in molecular biology, bacterial genetics, and nucleic acid enzymology
(Table 1.1). In a broad sense, molecular biotechnology draws on knowledge 
from a diverse set of fundamental scientific disciplines to create commercial
products that are useful in a wide range of applications (Fig. 1.2).

The Cohen and Boyer strategy for gene cloning was an experiment 
"heard round the world." Once their concept was made public, many other 
researchers immediately appreciated the power of being able to clone genes. 
Consequently, scientists created a large variety of experimental protocols 
that made identifying, isolating, characterizing, and utilizing genes more 
efficient and relatively easy. These technological developments have had an 
enormous impact on generating new knowledge in practically all biological 
disciplines, including animal behavior, developmental biology, molecular 
evolution, cell biology, and human genetics. Indeed, the emergence of the 
field of genomics was dependent on the ability to clone large fragments of 
DNA into plasmids in preparation for sequence determination. 


Commercialization of Molecular Biotechnology 

The potential of recombinant DNA technology reached the public with a
frenzy of excitement, and many people became rich on its promise. Indeed,
within 20 minutes of the start of trading on the New York Stock Exchange
on 14 October 1980, the price of shares in Genentech, the company founded
by Boyer with the chemist and entrepreneur Robert Swanson that produced
recombinant human insulin, went from $35 to $89. This was the
fastest increase in the value of any initial public offering in the history of
the market. It was predicted that some genetically engineered microorganisms
would replace chemical fertilizers and others would eat up oil spills,
plants with inherited resistance to a variety of pests and exceptional
nutritional content would be created, and livestock would have faster growing
times, more efficient feed utilization, and meat with low fat content. Many
were convinced that as long as a biological characteristic was genetically
determined by one or a few genes, organisms with novel genetic
constitutions could be readily created. Today we see that, despite the commercial
hype that dominated reality in the beginning, this infatuation with
recombinant DNA technology was not totally unfounded. A number of the more
sensible versions of the initial claims, although trimmed in scope, have
become realities.

FIGURE 1.2 Many scientific disciplines contribute to molecular biotechnology, which
generates a wide range of commercial products.

TABLE 1.1 Selected developments in the history of molecular biotechnology

Date       Event
1917       Karl Ereky coins the term "biotechnology"
1940       A. Jost coins the term "genetic engineering"
1943       Penicillin is produced on an industrial scale
1944       Avery, MacLeod, and McCarty demonstrate that DNA is the genetic material
1953       Watson and Crick determine the structure of DNA
1961       The journal Biotechnology and Bioengineering is established
1961-1966  Entire genetic code is deciphered
1970       First restriction endonuclease is isolated
1972       Khorana and coworkers synthesize an entire tRNA gene
1973       Boyer and Cohen establish recombinant DNA technology
1975       Köhler and Milstein describe the production of monoclonal antibodies
1976       First guidelines for the conduct of recombinant DNA research are issued
1976       Techniques are developed to determine the sequence of DNA
1978       Genentech produces human insulin in E. coli
1980       U.S. Supreme Court rules in the case of Diamond v. Chakrabarty that genetically manipulated microorganisms can be patented
1981       First commercial, automated DNA synthesizers are sold
1981       First monoclonal antibody-based diagnostic kit is approved for use in the United States
1982       First animal vaccine produced by recombinant DNA methodologies is approved for use in Europe
1983       Engineered Ti plasmids are used to transform plants
1988       U.S. patent is granted for a genetically engineered mouse susceptible to cancer
1988       PCR method is published
1990       Approval is granted in the United States for a trial of human somatic cell gene therapy
1990       Human Genome Project is officially initiated
1990       Recombinant chymosin is used for cheese making in the United States
1994-1995  Detailed genetic and physical maps of human chromosomes are published
1994       FDA announces that genetically engineered tomatoes are as safe as conventionally bred tomatoes
1995       First genome sequence of a cellular organism, the bacterium Haemophilus influenzae, is completed
1996       First recombinant protein, erythropoietin, exceeds $1 billion in annual sales
1996       Complete DNA sequence of all the chromosomes of a eukaryotic organism, the yeast Saccharomyces cerevisiae, is determined
1996       Commercial planting of genetically modified crops begins
1997       Nuclear cloning of a mammal (a sheep) with a differentiated cell nucleus is accomplished
1998       FDA approves first antisense drug
1999       FDA approves recombinant fusion protein (diphtheria toxin-interleukin-2) for cutaneous T-cell lymphoma
2000       Arabidopsis genome is sequenced
2000       Monoclonal antibodies exceed $2 billion in annual sales
2000       Development of "golden rice" (provitamin-A-producing rice) is announced
2000       Over $33 billion is invested in U.S. biotechnology companies
2001       Human genome is sequenced
2002       Complete human gene microarrays (gene chips) become commercially available
2002       FDA approves first nucleic acid test system to screen whole blood from donors for HIV and HCV
2004       Large-scale sequencing of the Sargasso Sea metagenome begins
2005       NCBI announces that there are 100 gigabases of nucleotides in the GenBank sequence database
2006       Recombinant cancer vaccine becomes available to protect against cervical cancer
2008       Two-billionth acre of genetically engineered crops is planted
2009       FDA approves first drug produced in a genetically engineered animal (a goat)

FDA, Food and Drug Administration; HCV, hepatitis C virus; HIV, human immunodeficiency virus; NCBI, National Center for Biotechnology Information;
PCR, polymerase chain reaction; tRNA, transfer ribonucleic acid.

In the 25 years since the commercial production of recombinant human
insulin, more than 200 new drugs produced by recombinant DNA technology
have been used to treat over 300 million people for diseases such as
cancer, multiple sclerosis, cystic fibrosis, and strokes and to provide
protection against infectious diseases. Over 400 new drugs are in the process of
being tested in human trials to treat Alzheimer disease and heart disease
(to name only two). Similarly, many new molecular biotechnology products
for enhancing crop and livestock yields, decreasing pesticide use, and
improving industrial processes, such as the manufacture of pulp and paper,
food, energy, and textiles, have been created and are being marketed.

The impact on agriculture has been tremendous. According to the Food
and Agriculture Organization of the United Nations, yield improvements
of all major crops have decreased due to poor agricultural management
practices, decreased acreage of arable land, and increased reliance on
fertilizers and pesticides that diminish soil quality. To produce more food on
less land, 13 million farmers in 25 countries are now planting genetically
engineered crops on 300 million acres of land. These crops are
predominantly corn, cotton, canola, and soybeans that are resistant to herbicides
and insects. Over the last 10 years in the United States, genetically
engineered crops contributed to $44 million in economic gains due to increased
yields and lower production costs. The global market value of genetically
modified crops is currently $7.5 billion. Small resource-poor farmers are
among the beneficiaries of agricultural biotechnology. In a comparative
study of small cotton farms in South Africa, it was found that the yield of
cotton from plants that were genetically engineered to produce a bacterial
insecticide was on average about 70% greater than that from
non-genetically modified plants over three seasons. Higher yields and reduced
pesticide and labor costs translated into doubled revenues despite the slightly
higher costs of the transgenic seeds. Similarly, in India, farmers who
planted genetically modified cotton increased their yields by 31% in 2008
while decreasing insecticide use by 39%. This resulted in an 88% increase
in profits for small farmers.

The ultimate objective of all biotechnology research is the development 
of commercial products. Consequently, molecular biotechnology is driven, 
to a great extent, by economics. Not only does financial investment cur¬ 
rently sustain molecular biotechnology, but clearly the expectation of finan- 
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Construction of Biologically Functional Bacterial 
Plasmids In Vitro 

S. N. Cohen, A. C. Y. Chang, H. W. Boyer, and R. B. Helling 
Proc. Natl. Acad. Sci. USA 70:3240-3244, 1973 


T he landmark study of Cohen et 
al. established the foundation 
for recombinant DNA tech¬ 
nology by showing how genetic infor¬ 
mation from different sources could be 
joined to create a novel, replicatable 
genetic structure. In this instance, the 
new genetic entities were derived 
from bacterial autonomously repli¬ 
cating extrachromosomal DNA struc¬ 
tures called plasmids. In a previous 
study, Cohen and Chang (Proc. Natl. 
Acad. Sci. USA 70:1293-1297,1973) 
produced a small plasmid from a large 
naturally occurring plasmid by 
shearing the larger plasmid into 
smaller random pieces and intro¬ 
ducing the mixture of pieces into a 
host cell, the bacterium E. coli. By 
chance, one of the fragments that was 
about 1/10 the size of the original 
plasmid was perpetuated as a func¬ 
tional plasmid. To overcome the ran¬ 
domness of this approach and to make 


the genetic manipulation of plasmids 
more manageable, Cohen and his 
coworkers decided to use an enzyme 
(restriction endonuclease) that cuts a 
DNA molecule at a specific site and 
produces a short extension at each 
end. The extensions of the cut ends of 
a restriction endonuclease-treated 
DNA molecule can combine with the 
extensions of another DNA molecule 
that has been cleaved with the same 
restriction endonuclease. 

Consequently, when DNA molecules from different sources are treated with
the same restriction endonuclease and mixed together, new DNA combinations
that never existed before can be formed. In this way, Cohen et al. not only
introduced a gene from one plasmid into another plasmid, but also
demonstrated that the introduced gene was biologically active. To their
credit, these authors fully appreciated that their strategy was "potentially
useful for insertion of specific sequences from prokaryotic or eukaryotic
chromosomes or extrachromosomal DNA into independently replicating
bacterial plasmids." In other words, any gene from any organism could
theoretically be cloned into a plasmid, which, after introduction into a
host cell, would be maintained indefinitely and, perhaps, produce the
protein encoded by the cloned gene. By demonstrating the feasibility of gene
cloning, Cohen et al. provided the experimental basis for recombinant DNA
technology; established that plasmids could act as vehicles (vectors) for
maintaining cloned genes; motivated others to pursue research in this area
that rapidly led to the development of more sophisticated vectors and
gene-cloning strategies; engendered concerns about the safety and ethics of
this kind of research that, in turn, was responsible for the establishment
of official guidelines and governmental agencies for conducting and
regulating recombinant DNA research, respectively; and contributed to the
formation of the molecular biotechnology industry.


cial gain was responsible for the considerable interest and excitement 
during the initial stages of its development. By nightfall on 14 October 
1980, the principal shareholders of Genentech stock were worth millions of 
dollars. The unprecedented enthusiastic public response to Genentech 
encouraged others to follow. Between 1980 and 1983, about 200 small
biotechnology companies were founded in the United States with the help of
tax incentives and funding from both stock market speculation and private 
investment. Like Herbert Boyer, who was first a research scientist at the 
University of California at San Francisco and then a vice president of 
Genentech, university professors started many of the early companies. 

Much of the commercial development of molecular biotechnology has 
been centered in the United States. By 1985, there were over 400
biotechnology companies, including many with names that contained variants of
the word "gene" to emphasize their expertise in gene cloning: Biogen, 
Amgen, Calgene, Engenics, Genex, and Cangene. Today, there are about 
1,500 biotechnology companies in the United States, 3,000 in Europe, and 
more than 8,000 worldwide, most in the health care sector. All large
multinational chemical and pharmaceutical companies, including Monsanto,
Du Pont, Pfizer, Eli Lilly, GlaxoSmithKline, Merck, Novartis, and 
Hoffmann-LaRoche, to name but a few, have made significant research 
commitments to molecular biotechnology. During the rapid proliferation 
of the biotechnology business in the 1980s, small companies were absorbed
by larger ones, strategic mergers took place, and joint ventures were
undertaken. For example, in 1991, 60% of Genentech was sold to Hoffmann-
LaRoche for $2.1 billion. Also, inevitably, for various reasons, there were a 
number of bankruptcies. This state of flux is a characteristic feature of the 
biotechnology industry. 

The annual earnings of the biotechnology industry have increased from 
about $6 million in 1986 to more than $70 billion in 2003. Worldwide, the 
biotechnology industry employs about 180,000 people. Since the 1980s, new, 
independent molecular biotechnology companies have usually been
specialized and have tended to stress the use of one particular aspect of
recombinant DNA technology. The extent of this specialization is often reflected in
their names. For example, after the formation of companies dedicated to the 
cloning of commercially important genes—Biogen, Amgen, Genzyme, 
Genentech, and so on—several U.S. molecular biotechnology companies, 
including ImmunoGen, Immunomedics, and MedImmune, were formed to
produce genetically engineered antibodies for treating infectious diseases, 
cancer, and other disorders in humans. Currently, the roster of
biotechnology companies is extensive and includes those focused on
cardiovascular disorders, tissue engineering, cell replacement, drug delivery, vaccines,
gene therapy, antisense drugs, microarray detection systems, diagnostics, 
genomics, proteomics, and agricultural biotechnology. 


Concerns and Consequences 

While many people appreciate the potential of molecular biotechnology to 
solve important problems in agriculture, medicine, and industry, they
recognize the need to be cautious about its widespread application. Indeed,
one of the first scientific responses to this new technology was a voluntary
moratorium on certain experiments that were thought to be potentially
hazardous. This research ban was self-imposed by a group of molecular
biologists, including Cohen and Boyer. They were concerned that combining
genes from two different organisms might accidentally create a novel 
organism with undesirable and dangerous properties. Within a few years, 
however, these apprehensions were allayed as scientists gained laboratory 
experience with this technology and safety guidelines were formulated for 
recombinant DNA research. The temporary cessation of some recombinant 
DNA research projects did not dampen the enthusiasm for genetic
engineering. In fact, the new technology continued to receive unprecedented
attention from both the public and the scientific community. 

Molecular biotechnology can contribute benefits to humanity. It can: 

• Provide opportunities to accurately diagnose, prevent, or cure a 
wide range of infectious and genetic diseases 

• Significantly increase crop yields by creating plants that are resistant 
to insect predation, fungal and viral diseases, and environmental 
stresses, such as short-term drought and excessive heat, and at the 
same time reduce applications of hazardous agrichemicals 

• Develop microorganisms that will produce chemicals, antibiotics, 
polymers, amino acids, enzymes, and various food additives that 
are important for food production and other industries 

• Develop livestock and other animals that have genetically enhanced 
attributes 
• Facilitate the removal of pollutants and waste materials from the
environment 

Although it is exciting and important to emphasize the positive aspects 
of new advances, there are also social concerns and consequences that must 
be addressed. The following are some examples. 

• Will some genetically engineered organisms be harmful either to 
other organisms or to the environment? 

• Will the development and use of genetically engineered organisms 
reduce natural genetic diversity? 

• Should humans be genetically engineered? 

• Will new diagnostic procedures undermine individual privacy? 

• Will financial support for molecular biotechnology constrain the 
development of other important technologies? 

• Will the emphasis on commercial success mean that the benefits of 
molecular biotechnology will be available only to wealthy nations? 

• Will agricultural molecular biotechnology undermine traditional 
farming practices? 


FIGURE 1.3 The Farm, by Alexis Rockman. According to the artist, "The Farm explores 
the iconography of agriculture. The Farm is set on a wide-angled field with all its 
usual trappings—animals, fruits, and vegetables. The situation, however familiar, 
is far from predictable. A disproportionately enormous and savage cow has an 
overabundance of teats. The pig is a human organ factory. And the chicken, which 
boasts three pairs of wings and no feathers, is ready for basting. The fruit fly, the 
workhorse of many a genetic study, is present as is a mouse with a human ear
cartilage projecting from its back.... Past, present, and future states are threaded
together here with barbed wire, woven baskets and DNA.... The Farm shows how
the bodies of these animals have been—and may one day be—transformed to suit 
our aesthetic, medical, gastronomic needs." © Alexis Rockman, 2000. Reprinted 
with the permission of the artist. 
• Will medical therapies based on molecular biotechnology supersede
equally effective traditional treatments? 

• Will the quest for patents inhibit the free exchange of ideas among 
research scientists? 

These and many other issues have been considered by government 
commissions, discussed extensively at conferences, and thoughtfully 
debated and analyzed by individuals in both popular and academic
publications. On this basis, rules and regulations have been formulated,
guidelines have been established, and policies have been created. There has been
active and extensive participation by both scientists and the general public 
in deciding how molecular biotechnology should proceed, although some 
controversies still remain. 

Molecular biotechnology, with much fuss and fanfare, became a
comprehensive scientific and commercial venture in a remarkably short time.
Many scientific and business publications are now devoted to the subject, 
and graduate and undergraduate programs and courses are available at 
universities throughout the world to teach it. Even artists have depicted 
their perception of molecular biotechnology (Fig. 1.3). It could be debated 
whether the early promise of biotechnology has been fulfilled in the way 
that was predicted in a 1987 document published by the U.S. Office of 
Technology Assessment, which declared that molecular biotechnology is "a 
new scientific revolution that could change the lives and futures of . . .
citizens as dramatically as did the Industrial Revolution two centuries ago and
the computer revolution today. The ability to manipulate genetic material
to achieve specified outcomes in living organisms . . . promises major
changes in many aspects of modern life." It does, however, offer solutions
to some serious global problems, including the spread of infectious
diseases, the burden of waste accumulation, and food shortages. The potential
of molecular biotechnology to solve some of these imminent problems is 
the subject of this book. 


SUMMARY 


In 1973, Stanley Cohen, Herbert Boyer, and their coworkers
devised a method for transferring genetic information
(genes) from one organism to another. This procedure, which
became known as recombinant DNA technology, enabled
researchers to isolate specific genes and to perpetuate them in
host organisms. Recombinant DNA technology has been
beneficial to many different areas of study. However, its impact on
biotechnology has been extraordinary. 

Biotechnology, for the most part, uses microorganisms on a 
large scale for the production of commercially important 
products. Before the advent of recombinant DNA technology, 
the most effective way of increasing the productivity of an 
organism was to induce mutations and then use selection
procedures to identify organisms with superior traits. This
process was not foolproof; it was time-consuming, labor-intensive,
and costly; and only a small set of traits could be enhanced in
this way. Recombinant DNA technology, however, provided a
rapid, efficient, and powerful means for creating
microorganisms with specific genetic attributes. Moreover, the tools of
recombinant DNA technology enable not only
microorganisms, but also plants and animals, to be genetically engineered.
Combining recombinant DNA technology with biotechnology
created a dynamic and exciting discipline called molecular
biotechnology.

From its beginning, molecular biotechnology captured the 
imagination of the public. Many small companies dedicated to 
gene cloning (recombinant DNA technology) were established 
with funding from private investors. Although these
biotechnology companies took somewhat longer than expected to
bring their products to the marketplace, a large number of 
recombinant DNA-based products are currently available, 
and many more are expected soon. 

Because of its broad impact, molecular biotechnology has 
been scrutinized carefully for its potential effects on society. 
Some of the concerns that have been raised are its safety, its 
possible negative effects on the environment, and the private 
or public ownership of genetically engineered organisms. 
REVIEW QUESTIONS


1. What is biotechnology? 

2. Distinguish between traditional biotechnology and molecular
biotechnology.

3. Describe the basic steps of a bioengineered biotechnology 
process. 

4. What are the shortcomings of the "mutation and selection" 
method for developing enhanced organisms for commercial 
purposes? 

5. Why was the work reported by Cohen and Boyer and their 
coworkers in 1973 considered important? 

6. How did recombinant DNA technology enable the
production of human insulin?

7. What are some of the problems that molecular
biotechnology has the potential to solve?

8. Discuss the statement "molecular biotechnology is a diverse 
science." 

9. Discuss some of the social concerns that have been raised 
about molecular biotechnology. 

10. Go to http://www.nytimes.com, http://news.yahoo.com, 
or an equivalent news website and conduct a search with the 
word "biotechnology." Describe and discuss three recent
biotechnology news stories.




DNA, RNA, and Protein 
Synthesis 


The information encoded in genetic material is responsible for
establishing and maintaining the cellular and biochemical functions of an
organism. In most organisms, the genetic material is a long
double-stranded DNA polymer. The sequence of units (deoxyribonucleotides) of
one DNA strand is complementary to the deoxyribonucleotides of the other
strand. This complementarity enables new DNA molecules to be
synthesized with the same linear order of deoxyribonucleotides in each strand as
an original DNA molecule. The process of DNA synthesis is called
replication. A specific order of deoxyribonucleotides determines the information
content of an individual genetic element (gene). Some genes encode
proteins, and others encode only ribonucleic acid (RNA) molecules. The
protein-coding genes (structural genes) are decoded by two successive major
cellular processes: RNA synthesis (transcription) and protein synthesis
(translation). First, a messenger RNA (mRNA) molecule is synthesized
from a structural gene using one of the two DNA strands as a template.
Second, an individual mRNA molecule interacts with other components,
including ribosomes, transfer RNAs (tRNAs), and enzymes, to produce a
protein molecule. A protein consists of a precise sequence of amino acids,
which is essential for its activity.

Although the deoxyribonucleotide sequences are different in genes 
encoding different functions, and for genes encoding similar functions in 
different organisms, the chemical compositions are the same. This enables 
molecular biotechnologists to transfer genes among a variety of organisms 
to create beneficial products. To understand how this is accomplished, it is 
helpful to know about the structure of DNA, replication, transcription, and 
translation. 


Structure of DNA 

The chemistry of DNA has been studied since 1868. By the 1940s, it was 
known that DNA is made up of individual units called nucleotides that are 
linked to each other to form long chains. A nucleotide consists of an organic 
base (base), a five-carbon sugar (pentose), and a phosphate group (Fig.
2.1A). The sugar of DNA is 2'-deoxyribose because it does not have a
hydroxyl (OH) group on the 2' carbon; rather, it has a hydroxyl group only
on the 3' carbon of the sugar moiety. By contrast, in mRNA, the five-carbon
sugar ribose has hydroxyl groups at both the 2' and 3' carbons of the pentose
ring. In both DNA and RNA, the phosphate group and base are attached to
the 5' carbon and 1' carbon atoms of the sugar moiety, respectively. In DNA,
there are four kinds of bases: adenine (A), guanine (G), cytosine (C), and
thymine (T) (Fig. 2.1B). The nucleotide subunits of DNA are joined by
phosphodiester bonds, with the phosphate group of the 5' carbon of one
nucleotide linked to the 3' OH group of the deoxyribose of the adjacent nucleotide
(Fig. 2.2). A polynucleotide strand has a 3' OH group at one end (the 3' end)
and a 5' phosphate group at the other (the 5' end).

FIGURE 2.1 Chemical structures of the components of DNA. (A) A representative
nucleotide. The term "base" denotes any of the four bases (adenine, guanine,
cytosine, and thymine) that are found in DNA. The deoxyribose sugar is enclosed by
dashed lines. The numbers with primes mark the carbon atoms of the deoxyribose
moiety. (B) The bases of DNA. The circled nitrogen atom is the site of attachment of
the base to the 1' carbon atom of the deoxyribose moiety.

FIGURE 2.2 Chemical structure of a single strand of DNA.

FIGURE 2.3 A rod-ribbon model of double-helical DNA. The rods represent the
complementary base pairs, and the ribbons represent the
deoxyribose-phosphate backbones.


In 1953, James Watson and Francis Crick, using X-ray diffraction
analysis of crystallized DNA, discovered that DNA consists of two long chains
(strands) that form a double-stranded helix (Fig. 2.3). The two
polynucleotide chains of DNA are held together by hydrogen bonds between the
bases of the opposite strands. Base pairing occurs only between specific,
complementary bases (Fig. 2.4). A pairs only with T, and G pairs only with
C. The A·T base pairs are held together by two hydrogen bonds, and the
G·C base pairs are held together by three. The number of complementary
base pairs is often used to characterize the length of a double-stranded 
DNA molecule. For DNA molecules with thousands or millions of base 
pairs, the designations are kilobase pairs and megabase pairs, respectively. 
For example, the DNA of human chromosome 1 is one double-stranded 
helix that has about 263 megabase pairs (Mb). 
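
The counting rules in this paragraph (two hydrogen bonds per A·T pair, three per G·C pair, and lengths quoted in base pairs, kilobase pairs, or megabase pairs) can be illustrated with a short Python sketch. The function names and the example sequence are illustrative assumptions, not part of the original text.

```python
# Hydrogen bonds contributed by each base pair, as stated in the text:
# an A·T pair has two hydrogen bonds and a G·C pair has three.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds in a duplex, given the sequence of one strand."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())


def format_length(base_pairs: int) -> str:
    """Express a duplex length in bp, kb, or Mb (hypothetical helper)."""
    if base_pairs >= 1_000_000:
        return f"{base_pairs / 1_000_000:g} Mb"
    if base_pairs >= 1_000:
        return f"{base_pairs / 1_000:g} kb"
    return f"{base_pairs} bp"


print(count_hydrogen_bonds("TAGGCAT"))  # 4 A/T pairs * 2 + 3 G/C pairs * 3 = 17
print(format_length(263_000_000))       # 263 Mb
print(format_length(4_500))             # 4.5 kb
```

Only one strand is needed as input because the partner strand is fixed by the pairing rules.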

The A·T and G·C base pairs lie within the interior of the molecule, and
the 5'-to-3'-linked phosphate and deoxyribose components form the
backbone of each strand (Fig. 2.4). The two strands of a duplex DNA molecule
run in opposite directions to each other (antiparallel chains). One chain is
oriented in a 3'-to-5' direction, and the other is oriented in a 5'-to-3'
direction. Because of the base-pairing requirements, when one strand of DNA
has, for example, the base sequence 5'-TAGGCAT-3', the complementary
strand must be 3'-ATCCGTA-5'. In this case, the double-stranded form
would be

5'-TAGGCAT-3'
3'-ATCCGTA-5'

By convention, when DNA is drawn on a horizontal
plane, the 5' end of the upper strand is on the left.
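
As a minimal sketch of the base-pairing rule described above, the following Python snippet derives the complementary strand for the example sequence 5'-TAGGCAT-3'. The function names are hypothetical and chosen only for illustration.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the partner strand written 3'-to-5', aligned under the input."""
    return "".join(COMPLEMENT[base] for base in strand_5_to_3.upper())


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the partner strand rewritten in the 5'-to-3' direction."""
    return complementary_strand(strand_5_to_3)[::-1]


top = "TAGGCAT"
print(f"5'-{top}-3'")                        # upper strand, 5' end on the left
print(f"3'-{complementary_strand(top)}-5'")  # 3'-ATCCGTA-5'
print(f"5'-{reverse_complement(top)}-3'")    # same lower strand read 5'-to-3'
```

The two functions return the same molecule; they differ only in the direction in which the lower strand is written, which is why the orientation labels matter when sequences are compared.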

Genetic material has two major functions. It encodes the information 
for the production of proteins, and it is reproduced (replicated) with a high 
degree of accuracy to pass the encoded information to new cells. The 
Watson-Crick model of DNA fully meets these important requirements. 
First, because of base complementarity, each preexisting DNA strand can 
act as a template for the production of a new complementary strand. 

FIGURE 2.4 Chemical structure of double-stranded DNA. 
FIGURE 2.5 DNA replication. (A) The incoming nucleotide is a deoxyribonucleoside
triphosphate that is directed by DNA polymerase to pair with the complementary
base of the template strand. (continued)


Consequently, after one round of replication, two daughter molecules are 
produced, with each having the same sequence of nucleotide pairs as the 
original DNA molecule. Second, the sequence of nucleotides of a gene
provides the code for the production of a protein. The linear order of amino
acids in a protein is determined by the linear sequence of
deoxyribonucleotides in a gene.


DNA Replication 

As predicted by the Watson-Crick model of DNA, each strand of an 
existing DNA molecule acts as a template for the production of a new 
strand, and the sequence of nucleotides of the synthesized (growing) 
strand is determined by base complementarity. During replication, the 

FIGURE 2.5 (continued) (B) The α phosphate of the incoming nucleotide forms a
phosphodiester bond with the 3' hydroxyl group of the growing strand. The next
incoming nucleotide of the growing strand that is complementary to the nucleotide
of the template strand is positioned by DNA polymerase.


phosphate group of each incoming nucleotide is enzymatically joined by a 
phosphodiester linkage to the 3' OH group of the last nucleotide that was 
incorporated in the growing strand (Fig. 2.5A). The nucleotides that are 
used for DNA replication are triphosphate deoxyribonucleotides that have
three consecutive phosphate groups attached to the 5' carbon of the
deoxyribose sugar moiety. The phosphate that is attached to the 5' carbon is
designated the α phosphate, the next phosphate is the β phosphate, and the
third one is the γ phosphate (Fig. 2.6). During the replication process, the β
and γ phosphates are cleaved off as a unit, and the α phosphate is linked to
the 3' OH group of the previously incorporated nucleotide (Fig. 2.5B). The
DNA synthesis machinery of prokaryotes and eukaryotes includes a large 
number of different proteins. Of these, DNA polymerases are responsible 
for binding deoxyribonucleotides, fitting the correct nucleotide into place 
MILESTONE


A Structure for Deoxyribose Nucleic Acid 

J. D. Watson and F. H. C. Crick 
Nature 171:737-738, 1953


The elucidation of the structure of the genetic material (DNA) was
undoubtedly one of the most important scientific breakthroughs of the 20th
century. James Watson and Francis Crick, who, with Maurice Wilkins, were
awarded the Nobel Prize in physiology or medicine in 1962 for this work,
suggested "a structure for the salt of deoxyribonucleic acid (D. N. A.).
This structure has novel features which are of considerable biological
interest.... This structure has two helical chains each coiled around the
same axis.... Both chains follow right handed helices, but . . . the
sequences of the atoms in the two chains run in opposite directions.... The
novel feature of the structure is the manner in which the two chains are
held together by the purine and pyrimidine bases.... They are joined in
pairs, a single base from one chain being hydrogen bonded to a single base
from the other chain.... One of the pair must be a purine and the other a
pyrimidine for bonding to occur.... If adenine forms one member of a pair,
on either chain, then . . . the other member must be a thymine; similarly
for guanine and cytosine. The sequence of bases on a single chain does not
appear to be restricted in any way.... It has not escaped our notice that
the specific pairing we have postulated immediately suggests a possible
copying mechanism for the genetic material." In another article a few
months later (Nature 171:964-967, 1953), Watson and Crick discussed more
implications for their model. "The phosphate-sugar backbone of our model is
completely regular but any sequence of the pairs of bases can fit into the
structure. It follows that in a long molecule many different permutations
are possible, and it therefore seems likely that the precise sequence of
bases is the code which carries the genetic information.... Our model
suggests possible explanations for a number of other phenomena. For
example, spontaneous mutations may be due to a base occasionally in one of
its less likely tautomeric forms. Again, the pairing between homologous
chromosomes at meiosis may depend on pairing between specific bases."

Within a decade of the demonstration of the double-helical nature of DNA
with its complementary base pairs, the molecular aspects of DNA replication
were known, the cellular processes that are responsible for both decoding
and regulating the synthesis of gene products were understood, and many of
the kinds of changes that lead to altered gene products were recognized.
From the time of its publication to the present, the scientific impact of
the Watson-Crick discovery has been pervasive and, for the most part,
inestimable. As one small example, recombinant DNA technology would not
exist without knowledge of the structure of DNA.


according to the base-pairing requirement of the template strand, and 
forming the phosphodiester linkage. 

In bacteria, DNA replication is initiated at a specific region of the
(usually circular) chromosome called the origin of replication (or origin) and, in
Escherichia coli, proceeds at the rate of about 1,000 nucleotides per second. In
eukaryotes, a chromosome has many different sites of initiation of
replication. Because of these multiple origins of replication, part of the eukaryotic
replication process includes enzymatically joining (ligating) segments of 
newly synthesized DNA together with phosphodiester bonds. Furthermore, 
in eukaryotes, a special replication enzyme called telomerase is used for the 
completion of the linear ends (telomeres) of each chromosome. 
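
To give a sense of scale for the replication rate quoted above, the following rough calculation estimates how long it would take to copy a bacterial chromosome. The rate of 1,000 nucleotides per second comes from the text. The genome size of about 4.6 million base pairs for E. coli and the idea that two replication forks move in opposite directions from the origin are assumptions added for this illustration, not statements from the original.

```python
def replication_time_seconds(genome_bp: int,
                             rate_nt_per_s: float = 1_000,
                             forks: int = 1) -> float:
    """Idealized time to copy a chromosome at a constant rate per fork."""
    if rate_nt_per_s <= 0 or forks < 1:
        raise ValueError("rate must be positive and forks must be at least 1")
    return genome_bp / (rate_nt_per_s * forks)


# Assumption (not from the text): approximate E. coli chromosome size.
ASSUMED_E_COLI_GENOME_BP = 4_600_000

one_fork = replication_time_seconds(ASSUMED_E_COLI_GENOME_BP)
two_forks = replication_time_seconds(ASSUMED_E_COLI_GENOME_BP, forks=2)

print(f"one fork:  {one_fork / 60:.1f} min")   # 76.7 min
print(f"two forks: {two_forks / 60:.1f} min")  # 38.3 min
```

The estimate ignores initiation, termination, and any pauses, so it should be read as an order-of-magnitude check rather than a measured value.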


Decoding Genetic Information: RNA and Protein 

The vast majority of genes encode information for the production of
protein chains. Proteins are essential polymers that are involved in almost all
biological functions. They catalyze chemical reactions; transport molecules
within cells; escort molecules between cells; control membrane
permeability; give support to cells, organs, and body structures; cause movement;
provide protection against infectious agents and toxins; and regulate the 
differential production of other gene products. A protein chain consists of 
FIGURE 2.6 Structure of a deoxyribonucleoside triphosphate. The phosphate units
are designated α, β, and γ. The α phosphate is directly attached to the 5' carbon of
the deoxyribose sugar. During DNA synthesis, the α phosphate participates in the
formation of the phosphodiester bond, and the β and γ phosphate groups are
released as a unit (pyrophosphate).


a specific sequence of units called amino acids. All amino acids have the 
same basic chemical structure. There is a central carbon atom (the α carbon)
that has a hydrogen (H), a carboxyl group (COO⁻), an amino group (NH₃⁺),
and an R group attached to it (Fig. 2.7A). An R group can be any 1 of 20
different side chains (groups) that make up the 20 different amino acids
found in proteins. When R, for example, is a methyl group (CH₃), then the
amino acid is alanine. The amino acids of proteins are designated by either 
a three- or a one-letter notation (see the table following chapter 23). For 
example, alanine is abbreviated Ala or A. In a protein, each amino acid is 
linked to an adjacent amino acid by a peptide bond that joins the carboxyl 
group of one amino acid to the amino group of the adjacent one (Fig. 2.7B). 
The first amino acid of a protein has a free amino group (N terminus), and 
the last amino acid in the polypeptide chain has a free carboxyl group (C 
terminus). 
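
The two notations for amino acids can be related with a small lookup table. The sketch below uses the standard three-letter and one-letter abbreviations for the 20 amino acids found in proteins. The function name and the hyphen-separated input format are assumptions made for the example.

```python
# Standard three-letter to one-letter abbreviations for the 20 amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a peptide such as 'Met-Ala-Gly' to one-letter notation.

    The peptide is assumed to be written from the N terminus (free amino
    group) on the left to the C terminus (free carboxyl group) on the right.
    """
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))


print(to_one_letter("Met-Ala-Gly"))  # MAG
```

Writing sequences N terminus first keeps the one-letter string in the same order in which the residues are linked by peptide bonds.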

Proteins range in length from about 40 to more than 1,000 amino acid
residues. A protein folds into a particular shape (configuration, or
conformation) depending on the locations of specific amino acid residues and
the overall amino acid composition. Individual amino acids have different
characteristics that are determined by the properties of their side chains,
and these influence the folding of the protein into a particular
three-dimensional shape. The shape of a protein in turn helps to determine its
function. Also, many functional proteins consist of two or more
polypeptide chains. In some cases, multiples of the same polypeptide chain are
required for an active protein molecule (homomeric protein). In other
instances, a set of different protein chains (subunits) assembles to form a
functional protein (heteromeric protein). Finally, large protein complexes
that are made up of many different subunits often perform important
cellular functions.

The decoding of genetic information is carried out through
intermediary RNA molecules that are transcribed from discrete regions of the
DNA. RNA molecules are linear polynucleotide chains that differ from 
DNA in two important respects. First, the sugar moiety of the nucleotides 
of RNA is ribose, which has hydroxyl groups on both the 2' and 3' carbons 
of the sugar. Second, instead of thymine, the base uracil (U) is found in 
RNA. Most RNA molecules are single stranded, although often there are 
segments of nucleotides within a single chain that are complementary to 
each other and form double-stranded regions (intrastrand pairing) (Fig. 
2.8). The base pairing within a single RNA strand is the same as the base 


FIGURE 2.7 Generalized structures of an 
amino acid and a peptide bond. (A) An 
amino acid. The R represents the loca¬ 
tion of the side chain. (B) A peptide 
bond. The peptide bond is encircled, 
and R1 and R2 represent different side 
chains. 
pairing between complementary sequences of DNA, except that uracil
pairs with adenine. Base pairing can occur between two RNA molecules if
they contain complementary base sequences.
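
The intrastrand pairing described above can be sketched in Python by testing whether two segments of an RNA chain are complementary. The sketch assumes that paired segments run antiparallel, as the two strands of duplex DNA do, and it considers only A·U and G·C pairs. The function names and the example sequence are hypothetical.

```python
# RNA pairing rules from the text: uracil pairs with adenine, G pairs with C.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def rna_reverse_complement(segment: str) -> str:
    """Sequence (5'-to-3') that would pair with the given 5'-to-3' segment."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(segment.upper()))


def can_pair(segment_a: str, segment_b: str) -> bool:
    """True if the two 5'-to-3' segments can pair along their full length."""
    return rna_reverse_complement(segment_a) == segment_b.upper()


# A hypothetical single chain: a 4-base stem, a 4-base loop, and the return stem.
rna = "GGCAUUUUUGCC"
left_stem, loop, right_stem = rna[:4], rna[4:8], rna[8:]
print(can_pair(left_stem, right_stem))  # True: GGCA pairs with UGCC
print(can_pair(left_stem, loop))        # False: UUUU is not complementary
```

The same check applies to two separate RNA molecules, since the pairing rules do not depend on whether the segments belong to one chain or two.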

The major kinds of RNA molecules that are essential for the decoding 
of genetic information are mRNA, ribosomal RNA (rRNA), and tRNA. The 
production of RNA from DNA is called transcription. In most prokaryotes, 
a single RNA polymerase is responsible for the transcription of all types of 
RNA. In eukaryotic organisms, mRNA, rRNA, and tRNA are each
transcribed by a different RNA polymerase.

In many of its features, transcription resembles replication. Briefly, one 
strand of the DNA of a specific region acts as a template for the synthesis 
of a polymer of ribonucleotides. RNA polymerase sequentially joins, via 
3'-5' phosphodiester linkages, ribonucleotides that are complementary to 
the nucleotides of the template DNA strand (Fig. 2.9). As transcription
proceeds, the newly synthesized RNA is released from the DNA and the DNA
helix re-forms. Since only specific segments of DNA molecules are
transcribed, sets of short stretches of base pairs within the DNA are required to
ensure that transcription is initiated at the correct nucleotide and that it 
terminates at a specific nucleotide. The sequences that control the initiation 
of transcription usually precede the coding sequence, and the termination 
signal sequences follow it. The DNA segment that precedes a gene is called 
the 5'-flanking or upstream region, and the one following a gene is the 


FIGURE 2.8 Secondary structure of an RNA molecule. The lines represent hydrogen 
bonding between complementary base pairs. The ribose-phosphate backbone is 
omitted. 
FIGURE 2.9 Schematic representation of transcription. The arrow indicates the
direction of transcription.


3'-flanking or downstream region. To initiate transcription, RNA
polymerase binds to a specific sequence of nucleotides upstream of the coding
sequence known as the promoter. Similarly, a specific sequence of
nucleotides downstream of the coding sequence, known as the transcriptional
terminator, signals RNA polymerase to stop RNA synthesis.
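
The template-directed step of transcription can be illustrated with a minimal Python sketch: each base of the template DNA strand is matched with the complementary ribonucleotide, with uracil taking the place of thymine. The function name is hypothetical, and promoter and terminator recognition are deliberately left out.

```python
# Pairing of template DNA bases with incoming ribonucleotides:
# A in the template directs U in the RNA; T directs A; G and C pair as in DNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'-to-3') made from a template strand read 3'-to-5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


template = "ATCCGTA"           # template strand, written 3'-to-5'
rna = transcribe(template)
print(f"5'-{rna}-3'")          # 5'-UAGGCAU-3'

# The transcript matches the non-template strand (5'-TAGGCAT-3') with U for T.
print(rna == "TAGGCAT".replace("T", "U"))  # True
```

Because the template is read 3'-to-5' while the RNA grows 5'-to-3', no reversal is needed when the template is written in that orientation.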

From a molecular perspective, a gene is a specific nucleotide sequence 
that is transcribed into RNA. Structural genes, which make up the vast 
majority of transcribed DNA sequences, encode proteins; however, the
initial transcription product of a structural gene is an mRNA. In prokaryotes,
a contiguous DNA segment forms a structural gene (the coding region).
Prokaryotic transcription entails the binding of RNA polymerase to a
promoter region, the initiation of transcription at a nucleotide upstream of the
structural gene, and the cessation of transcription at a termination sequence
that lies downstream from the coding region (Fig. 2.10). In eukaryotic
organisms, a structural gene usually consists of several coding regions (exons)
that are separated by noncoding regions (introns, or intervening sequences). 
After RNA polymerase has bound to the promoter and the entire eukaryotic 
structural gene is transcribed, the introns are removed from the primary 
transcript, and the exons, in the correct order, are linked (spliced) together 
to form a functional mRNA (Fig. 2.11 and 2.12). In general, exons tend to be 
150 to 300 bases in length, and introns can vary from as few as 40 to over 
10,000 bases. A small number of eukaryotic structural genes lack introns. 


FIGURE 2.10 Schematic representation of a prokaryotic structural gene. The promoter
region (p), the site of initiation and direction of transcription (the right-angled
arrow), and the termination sequence for RNA polymerase (t) are depicted.
A prokaryotic structural gene is transcribed into mRNA, which is then translated
directly into protein.

FIGURE 2.11 Schematic representation of a eukaryotic structural gene. The promoter
region (p), the site of initiation and direction of transcription (the right-angled
arrow), and the termination sequence for RNA polymerase (t) are depicted. The
numbers 1 to 5 mark the exons of the structural gene, and the letters a to d mark the
introns. The primary transcript is polyadenylated at the 3' end and capped with a
modified guanine (G) nucleotide at the 5' end. Processing of the primary transcript
removes the introns. The functional RNA is translated into protein.


In some instances, the introns in a primary transcript may be legitimately
removed in more than one way in a process known as alternative
splicing. For example, in one kind of tissue, all the exons of the primary
transcript may be spliced together to form a functional mRNA, whereas in
another tissue, the initial transcript may undergo a different pattern of exon
splicing, with an exon being skipped during the process of intron removal
and a novel functional mRNA being produced. The exon-skipping mechanism
generates different gene products in different tissues from the same


FIGURE 2.12 Splicing of a eukaryotic primary RNA transcript. The bracketing arrows
mark the sites that are spliced together after the removal of the intervening RNA
regions. In this example, introns a and b are spliced out of the primary transcript,
and exons 1, 2, and 3 are spliced together to form a functional mRNA.

FIGURE 2.13 Alternative splicing of a eukaryotic primary RNA transcript. The bracketing
arrows mark the sites that are spliced together after the removal of the intervening
RNA region. In this example, exon 2, flanked by introns a and b, is spliced
out of the primary transcript, and exons 1 and 3 are spliced together to form a functional
mRNA transcript.


structural gene (Fig. 2.13). For example, in Drosophila, the fruit fly that is
commonly used in genetic studies, two different mRNAs are produced from
the doublesex (dsx) gene as a consequence of alternative splicing of the exons
contained in the gene (Fig. 2.14). One form is produced exclusively in female
flies and the other only in male flies, and each encodes a protein that has a
different activity. The protein produced in female flies prevents the development
of some male-specific characteristics, including male genitalia, and
conversely, the protein produced in male flies prevents the development of
female-specific traits.
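
Constitutive splicing and exon skipping can be illustrated with a toy model. The sketch below assumes that exon boundaries are already known and given as string coordinates; the transcript, the coordinates, and the function name are hypothetical, and real splice-site recognition is not modeled.

```python
from __future__ import annotations

# Toy model of splicing: exons are uppercase, introns lowercase,
# purely so that the result is easy to read.

def splice(primary_transcript: str,
           exons: list[tuple[int, int]],
           skip: frozenset[int] = frozenset()) -> str:
    """Join exons in their original order.

    exons: (start, end) pairs, 0-based and end-exclusive.
    skip:  1-based exon numbers to leave out (exon skipping).
    """
    return "".join(
        primary_transcript[start:end]
        for number, (start, end) in enumerate(exons, start=1)
        if number not in skip
    )

# exon 1 + intron a + exon 2 + intron b + exon 3 (all hypothetical)
primary = "AAAA" + "gugu" + "CCCC" + "gaga" + "UUUU"
exons = [(0, 4), (8, 12), (16, 20)]

print(splice(primary, exons))             # AAAACCCCUUUU (as in Fig. 2.12)
print(splice(primary, exons, skip={2}))   # AAAAUUUU     (as in Fig. 2.13)
```

The same primary transcript yields two different mRNAs depending only on which exons are retained, which is the essence of the exon-skipping mechanism.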

Most (>90%) of the RNA in a metabolically active cell is rRNA found 
in ribosomes. Although there can be hundreds to thousands of different 


FIGURE 2.14 Alternative splicing of the doublesex (dsx) primary transcript produces 
two different mRNAs in the fruit fly Drosophila melanogaster. The first three exons 
are spliced together in both male and female flies; however, in male flies, exon 4 is 
skipped, resulting in the splicing of exon 3 to exons 5 and 6. In female flies, exon 3 
is spliced to exon 4, which contains a signal for the addition of a poly(A) tail (pA), 
resulting in a shortened mRNA. The numbered boxes indicate exons, while the red
and blue lines indicate intron sequences that are spliced out of the primary transcript
in female and male flies, respectively. Adapted from Maniatis and Tasic,
Nature 418:236-243, 2002.


[Figure 2.14 diagram labels: dsx primary transcript (exons 1 to 6; pA marks poly(A) addition signals); exon splicing in female fruit flies gives a dsx mRNA with exons 1, 2, 3, and 4; exon splicing in male fruit flies gives a dsx mRNA with exons 1, 2, 3, 5, and 6.]

FIGURE 2.15 Conformation of a charged 
tRNA. The amino acid is attached to 
the 3' end of the tRNA, and the location 
of the anticodon region is enclosed by 
dashed lines. 


mRNAs in a cell, they represent only about 3 to 5% of the cellular RNA,
while tRNA represents about 4%. The rRNA combines with specific proteins
to form ribonucleoprotein complexes that make up the large and small ribosomal
subunits. During protein synthesis, one large ribosomal subunit and
one small ribosomal subunit combine to form a ribosome. Both of the ribosomal
subunits of eukaryotes are larger than those of prokaryotes.

In addition to thousands of ribosomes, a cell that is actively synthesizing
proteins has about 60 different types of tRNA molecules. The tRNA
molecules range in length from about 75 to 93 nucleotides. Because of
intrastrand complementary segments of nucleotides, each tRNA molecule
forms a folded, L-shaped structure (Fig. 2.15). A particular amino acid is
enzymatically linked by its carboxyl end to the 3' end of a specific tRNA.
For example, the enzyme arginyl-tRNA synthetase adds the amino acid
arginine to the tRNA^Arg molecule. There is at least one tRNA for each of the
20 amino acids found in proteins. After the binding of a particular amino
acid to its tRNA, the tRNA is said to be "charged." In another part of the
tRNA molecule, there are three unpaired nucleotides that together are
called the anticodon sequence. This sequence plays a crucial role in the
formation of the linear array of amino acids that constitute a protein.


Translation 

In prokaryotes, which lack a nucleus, the processes of transcription and
translation are not spatially separated. When a newly synthesized mRNA
molecule begins to emerge from the RNA polymerase complex, a ribosome
binds to a specific ribonucleotide sequence near the 5' end of the
mRNA to initiate translation. Thus, transcription and translation occur
concurrently in a prokaryotic cell (Fig. 2.16). In contrast, in eukaryotic
cells, a mature mRNA molecule leaves the nucleus via special pores in the
nuclear membrane and is bound by ribosomes that either remain in the


FIGURE 2.16 Concurrent transcription and translation in prokaryotes. When nascent
mRNA emerges from RNA polymerase (yellow ovals), ribosomes (tan, double-lobed
shapes) bind to the ribosome-binding site on the mRNA and begin translation.
Translation begins before transcription of mRNA is completed. The arrows
indicate the direction of transcription.

FIGURE 2.17 Transcription and translation are spatially separated in eukaryotes.
Transcription occurs in the nucleus. Primary transcripts produced by RNA polymerase
(yellow ovals) are processed to remove introns (grey lines), and the resulting
mRNA (green lines) is exported from the nucleus via nuclear pores. Ribosomes (tan,
double-lobed shapes) in the cytoplasm or associated with the endoplasmic reticulum
translate the mRNA to produce proteins that remain in the cytoplasm or are
translocated into the lumen of the endoplasmic reticulum for further processing.


cytoplasm or associate with the endoplasmic reticulum (Fig. 2.17). In addition
to removal of introns, before the mRNA leaves the nucleus, it is
capped with a modified guanine nucleotide at the 5' end and a polymer of
adenine nucleotides is added to the 3' end to form a poly(A) tail (Fig. 2.18).
The 5' cap and the 3' poly(A) tail aid in the binding of ribosomes to the
mRNA to begin translation.

Translation requires the interaction of mRNA, charged tRNAs, ribosomes,
and a large number of proteins (factors) that facilitate the initiation,

FIGURE 2.19 Schematic representation of the initiation of translation in prokaryotes.
The mRNA binds to the small ribosomal subunit. For some mRNAs, the Shine-Dalgarno
sequence near the 5' end of the mRNA base pairs with a sequence near the 3' end of
the rRNA of the small ribosomal subunit. The anticodon (UAC) of the initiator
fMet-tRNA^fMet base pairs with the start codon (AUG) of the mRNA. The large ribosomal
subunit combines with the initiator tRNA-mRNA-small ribosomal subunit complex to
form the initiation complex. The amino acid methionine of the initiator tRNA is
formylated (CHO) at the amino group in prokaryotes (not depicted). After translation,
the initial formyl-methionine is removed from the protein chain.

FIGURE 2.18 Modification of the ends of a primary RNA transcript in the nucleus. A
modified guanine nucleotide cap is added to the 5' end of the transcript, and a
polyadenylation signal in the RNA sequence specifies the addition of a polymer of
50 to 250 adenine (A) nucleotides to the 3' end to form a poly(A) tail. The modified
ends aid in the transport of the mature mRNA from the nucleus and in the binding
of ribosomes to the mRNA and increase the stability of the mRNA.


elongation, and termination of the polypeptide chain. In prokaryotes,
translation is initiated by the binding of a small ribosomal subunit to an
mRNA by base pairing between a sequence of about 8 nucleotides (a Shine-Dalgarno
sequence) that is located near the 5' end of the mRNA and a
complementary sequence near the 3' end of the rRNA of the small ribosomal
subunit. The 3'-UAC-5' anticodon of a specific initiator tRNA,
fMet-tRNA^fMet, where f represents a formyl moiety that is bound to the
methionine residue, binds to a 5'-AUG-3' codon (start codon) of the mRNA.
Proteins (initiation factors) facilitate the binding of the initiator tRNA to the
mRNA-small ribosomal subunit complex. A large ribosomal subunit then
combines with the fMet-tRNA^fMet-mRNA-small subunit complex to form
the initiation complex (Fig. 2.19).

In eukaryotes, translation is initiated by the binding of a particular
charged initiator tRNA, Met-tRNA^Met, along with initiation factors, to a
small ribosomal subunit. Next, the 5' capped end of an mRNA, which is
combined with specific proteins, associates with the initiator tRNA-small
ribosomal subunit complex, and the complex migrates along the mRNA
until an AUG sequence (initiator, or start codon) is encountered. The 3'
poly(A) tail of the mRNA facilitates the interaction between the mRNA and
the ribosome. When the UAC anticodon sequence of the initiator
Met-tRNA^Met base pairs with the AUG sequence of the mRNA, the migration
stops, and the large ribosomal subunit joins the complex to form the initiation
complex (Fig. 2.20).

The elongation and termination phases of translation are very similar in
prokaryotes and eukaryotes. The elongation process entails the formation of
a peptide bond between adjacent amino acids, with the order of the amino
acids being determined by the order of codons of the mRNA (Fig. 2.21).
More specifically, after the initiation complex is formed, the second set of
three nucleotides (triplet, or codon) in the mRNA that immediately follows
the AUG codon dictates the anticodon sequence and, therefore, the charged
tRNA that will bind to the ribosome complex. Uncharged tRNAs do not
bind efficiently to ribosomes. For example, if the second nucleotide triplet in
the mRNA is CUG, then the charged tRNA with the anticodon sequence
3'-GAC-5' will bind. This charged tRNA carries the amino acid leucine.
Once this charged tRNA is in place, a peptide bond is formed between the
carboxyl group of the methionine and the amino group of the leucine. The
leucine remains bound to its tRNA. Peptide bond formation is catalyzed by
an activity associated exclusively with the rRNA of the large ribosomal
subunit. The formation of the
peptide bond "discharges" the initiator tRNA because the bond between the
carboxyl group of methionine and its tRNA is cleaved to make the carboxyl
group available for peptide bond formation. The uncharged tRNA is ejected
from the ribosomal complex. The methionine-leucine-tRNA^Leu-mRNA
combination shifts (translocates) along the ribosome to the site vacated by
the initiator tRNA, and as a consequence, the next codon of the mRNA is
available for binding by another charged tRNA with the appropriate anticodon
sequence. If the third codon is UUU, then the charged tRNA with an
AAA anticodon will bind. In this case, the tRNA with an AAA anticodon
carries the amino acid phenylalanine. Once this charged tRNA is in place,
the linkage between the carboxyl group of leucine and its tRNA is broken
and a peptide bond is formed between the carboxyl group of the leucine and
the amino group of the phenylalanine. After ejection of the uncharged
tRNA^Leu, the "peptidyl" tRNA^Phe, with the attached methionine-leucine-phenylalanine
amino acid polymer and the mRNA, is translocated to the
peptidyl site (P site), and the next codon is available for binding by the
appropriately charged tRNA in the aminoacyl site (A site).
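
The codon-anticodon pairings used in this walk-through (AUG with UAC, CUG with GAC, and UUU with AAA) follow from simple base complementarity. The short sketch below reproduces them; it assumes strict Watson-Crick pairing at all three positions, which is a simplification, and the function name is illustrative only.

```python
# Minimal sketch: derive the anticodon that pairs with a given codon.
# Assumption: strict A-U and G-C pairing at every position.

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_3_to_5(codon_5_to_3: str) -> str:
    """Anticodon written 3'->5' so that it aligns base-for-base with the codon."""
    return "".join(PAIR[base] for base in codon_5_to_3.upper())

for codon in ("AUG", "CUG", "UUU"):
    print(f"5'-{codon}-3' pairs with 3'-{anticodon_3_to_5(codon)}-5'")
```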

The succession of operations that includes binding of a charged tRNA
by means of anticodon-codon pairing, peptide bond formation, ejection of
an uncharged tRNA, and translocation continues until all the amino acids
that are encoded by the mRNA are linked together. Translation occurs in a
5'-to-3' direction along the mRNA at a rate of about 15 amino acids per
second. When the 5' end of the mRNA is free of a ribosome, it can combine
with another initiation complex. A single mRNA can be translated simultaneously
by a number of ribosomes, with each ribosome producing a polypeptide
chain. In rapidly growing E. coli cells, the entire population of
approximately 20,000 ribosomes per cell is capable of producing about
30,000 polypeptides per minute. Parenthetically, the average bacterial
structural gene with about 1,000 base pairs (bp) encodes a protein with 333
amino acids because 3 bases code for each amino acid. With the mean molecular
weight of an amino acid being about 105, the molecular weight of an
average bacterial protein is about 35,000.
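
The parenthetical estimate above can be checked with a few lines of arithmetic. The figures are the ones quoted in the text; the variable names are illustrative only.

```python
# Worked arithmetic for the average bacterial protein described in the text.
gene_length_bp = 1_000        # average bacterial structural gene
bases_per_codon = 3           # 3 bases code for each amino acid
mean_residue_weight = 105     # mean molecular weight of an amino acid

amino_acids = gene_length_bp // bases_per_codon     # whole codons only
protein_weight = amino_acids * mean_residue_weight

print(amino_acids)      # 333
print(protein_weight)   # 34965, i.e., about 35,000
```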

The elongation process continues until a UAA, UAG, or UGA codon
(stop codon, or termination codon) is encountered (Fig. 2.22). There are no
naturally occurring tRNAs with anticodons that are complementary to these
codons. However, a protein(s) (termination factor, or release factor) recognizes
a stop codon and binds to the ribosome. After binding of a termination
factor, the bond between the last tRNA, which has the complete chain of
amino acids linked to it, and its amino acid is broken, resulting in dissociation
of the uncharged tRNA, the complete protein, and the mRNA from the
ribosome. In addition, a ribosome-releasing factor separates the ribosomal
subunits so that they can be recycled for the translation of other mRNAs.
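
Initiation at AUG, codon-by-codon elongation, and termination at UAA, UAG, or UGA can be combined into a small decoding routine. The sketch below is a simplification under stated assumptions: it starts at the first AUG in the string (ribosome-binding sites and scanning are not modeled), it uses the standard genetic code, and the mRNA is a hypothetical sequence chosen to encode the peptide shown in Fig. 2.22.

```python
from itertools import product

# Standard genetic code, built from the conventional U, C, A, G ordering.
# One-letter amino acid symbols; "*" marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, no polypeptide
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":             # UAA, UAG, or UGA: release the chain
            break
        peptide.append(residue)
    return "".join(peptide)

# Hypothetical mRNA: short leader, coding region, stop codon, short trailer.
mrna = "".join(["GGAGGUAAC", "AUG", "CUG", "UUU", "GAU", "UAC",
                "GCU", "GGC", "GCA", "UGC", "GUG", "UAG", "CC"])
print(translate(mrna))   # MLFDYAGACV (Met-Leu-Phe-Asp-Tyr-Ala-Gly-Ala-Cys-Val)
```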

After translation, a protein may be modified in various ways. In both
prokaryotes and eukaryotes, the methionine at the N terminus is cleaved
from most proteins, leaving the second encoded amino acid as the
N-terminal moiety. In eukaryotes, certain proteins are selectively cleaved at
defined sites (processed) to make smaller protein chains that have discrete
functions. In other instances, especially in eukaryotes, phosphate groups,
lipids, carbohydrates, or other low-molecular-weight groups are enzymatically
added to certain amino acids of a protein. These chemical additions
(posttranslational modifications) create proteins that mediate specific cellular
activities.



FIGURE 2.20 Schematic representation of the initiation of translation in eukaryotes.
The initiator tRNA (Met-tRNA^Met) binds to a small ribosomal subunit, and
then the mRNA moves along the complex until the anticodon (UAC) of the
initiator tRNA base pairs with the start codon (AUG) of the mRNA. The large
ribosomal subunit combines with the mRNA-initiator tRNA-small ribosomal
subunit complex to form the initiation complex.

FIGURE 2.21 Schematic representation of the elongation phase of translation. (A) The
second codon (CUG) base pairs with the anticodon (GAC) of Leu-tRNA^Leu. (B) The
methionine of the initiator tRNA is joined to the leucine of Leu-tRNA^Leu by a peptide
bond, and the uncharged initiator tRNA is ejected from the ribosome. (C)
Translocation of the peptidyl-tRNA and the mRNA to the peptidyl site from the
aminoacyl site opens the aminoacyl site for the next codon (UUU). (D) The third
codon (UUU) base pairs with the anticodon (AAA) of Phe-tRNA^Phe. (E) The leucine
of the peptidyl-tRNA is joined to the phenylalanine of the Phe-tRNA^Phe by a peptide
bond, and the uncharged tRNA^Leu is ejected from the ribosome. (F) Translocation
of the peptidyl-tRNA and mRNA to the peptidyl site from the aminoacyl site opens
the aminoacyl site for the next codon and codon-anticodon interaction.


The complete genetic code consists of 64 codons. Three of these codons
are reserved for stops, and one (AUG) is used for initiation (Table 2.1).
When a methionine residue occurs internally in a protein, the codon AUG
is recognized by another Met-tRNA^Met that is neither formylated nor the
initiator tRNA. There is one codon (UGG) for the amino acid tryptophan.
For the rest of the amino acids that are found in proteins, there are at least
two, usually four, and sometimes as many as six codons. For example, there
are six codons (UUA, UUG, CUU, CUC, CUA, and CUG) for the amino
acid leucine. Different codons are used to different extents in different
organisms (Table 2.1). Of the four codons for glycine, GGA is used about
26% of the time by human structural genes and about 9% of the time by
protein-coding genes of E. coli. The stop codons are also used to different
extents in different organisms. In humans, the frequencies of usage of
UAA, UAG, and UGA are 0.22, 0.17, and 0.61, respectively, whereas in E.
coli, they are 0.62, 0.09, and 0.30, respectively. The differences in codon
usage notwithstanding, the genetic code, with a few rare exceptions, is the
same in all organisms.
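
Differences in codon usage are easy to explore once the frequencies are held in a small lookup structure. The sketch below uses only the stop codon frequencies quoted in the paragraph above; the dictionary layout and the function name are assumptions made for illustration.

```python
# Stop codon usage frequencies as quoted in the text (see also Table 2.1).
STOP_CODON_USAGE = {
    "UAA": {"E. coli": 0.62, "human": 0.22},
    "UAG": {"E. coli": 0.09, "human": 0.17},
    "UGA": {"E. coli": 0.30, "human": 0.61},
}

def preferred_codon(usage: dict, organism: str) -> str:
    """Return the synonymous codon used most often by the given organism."""
    return max(usage, key=lambda codon: usage[codon][organism])

for organism in ("E. coli", "human"):
    print(organism, "most often terminates with",
          preferred_codon(STOP_CODON_USAGE, organism))
```

Comparisons of this kind matter when a gene from one organism is expressed in another, because a codon that is common in the source organism may be rare in the host.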

[Figure 2.22 diagram label: H2N-Met-Leu-Phe-Asp-Tyr-Ala-Gly-Ala-Cys-Val-COOH]



FIGURE 2.22 Schematic representation of the termination of translation. The stop 
codon (UAG) interacts with a termination factor that leads to the termination of 
translation. The last tRNA is cleaved from the peptide chain and ejected. The 
mRNA and the finished peptide are released. The ribosomes are prepared for recycling
by a ribosome-releasing factor.


TABLE 2.1 Genetic code and codon usage in E. coli and humans

                          Frequency of use in:
Codon   Amino acid        E. coli   Humans
GGG     Glycine           0.13      0.23
GGA     Glycine           0.09      0.26
GGU     Glycine           0.38      0.18
GGC     Glycine           0.40      0.33
GAG     Glutamic acid     0.30      0.59
GAA     Glutamic acid     0.70      0.41
GAU     Aspartic acid     0.59      0.44
GAC     Aspartic acid     0.41      0.56
GUG     Valine            0.34      0.48
GUA     Valine            0.17      0.10
GUU     Valine            0.29      0.17
GUC     Valine            0.20      0.25
GCG     Alanine           0.34      0.10
GCA     Alanine           0.22      0.22
GCU     Alanine           0.19      0.28
GCC     Alanine           0.25      0.40
AAG     Lysine            0.24      0.60
AAA     Lysine            0.76      0.40
AAU     Asparagine        0.39      0.44
AAC     Asparagine        0.61      0.56
AUG     Methionine        1.00      1.00
AUA     Isoleucine        0.07      0.14
AUU     Isoleucine        0.47      0.35
AUC     Isoleucine        0.46      0.51
ACG     Threonine         0.23      0.12
ACA     Threonine         0.12      0.27
ACU     Threonine         0.21      0.23
ACC     Threonine         0.43      0.38
UGG     Tryptophan        1.00      1.00
UGU     Cysteine          0.43      0.42
UGC     Cysteine          0.57      0.58
UGA     Stop              0.30      0.61
UAG     Stop              0.09      0.17
UAA     Stop              0.62      0.22
UAU     Tyrosine          0.53      0.42
UAC     Tyrosine          0.47      0.58
UUU     Phenylalanine     0.51      0.43
UUC     Phenylalanine     0.49      0.57
UCG     Serine            0.13      0.06
UCA     Serine            0.12      0.15
UCU     Serine            0.19      0.17
UCC     Serine            0.17      0.23
AGU     Serine            0.13      0.14
AGC     Serine            0.27      0.25
CGG     Arginine          0.08      0.19
CGA     Arginine          0.05      0.10
CGU     Arginine          0.42      0.09
CGC     Arginine          0.37      0.19
AGG     Arginine          0.03      0.22
AGA     Arginine          0.04      0.21
CAG     Glutamine         0.69      0.73
CAA     Glutamine         0.31      0.27
CAU     Histidine         0.52      0.41
CAC     Histidine         0.48      0.59
CUG     Leucine           0.55      0.43
CUA     Leucine           0.03      0.07
CUU     Leucine           0.10      0.12
CUC     Leucine           0.10      0.20
UUG     Leucine           0.11      0.12
UUA     Leucine           0.11      0.06
CCG     Proline           0.55      0.11
CCA     Proline           0.20      0.27
CCU     Proline           0.16      0.29
CCC     Proline           0.10      0.33


Regulation of mRNA Transcription in Bacteria

In bacteria, the production of amino acids, nucleotides, and other essential
metabolites; replication; transcription; translation; cell growth; catabolic
pathways; energy-generating systems; and responses to environmental
changes all depend on proteins. However, the energy resources of a cell are
not sufficient to support the transcription and translation (expression) of all
of its structural genes at the same time. Consequently, only those genes that
encode proteins that maintain basic cellular functions are expressed continuously.
The transcription of the remaining structural genes is regulated.
If a protein(s) is required by a cell, then a signaling system initiates transcription
(the "on" state) of the pertinent structural gene(s). Alternatively,
if a protein(s) is not needed, transcription of the protein-coding gene(s) is
turned off (the "off" state).

Frequently, bacterial structural genes that encode proteins required for
several steps in a single metabolic pathway are contiguous in the chromosome.
This arrangement is called an operon. Generally, an operon is under
the control of a single promoter, and its transcription gives rise to one large
mRNA. The placement of a stop codon for one protein close to the start
codon of the next protein within a multigene mRNA generates a set of discrete
proteins during translation. Note that a ribosome-binding site (a
Shine-Dalgarno sequence) precedes each start codon.

For many of the structural genes of E. coli, the promoter region has
two DNA-binding sites for RNA polymerase; more specifically, the
binding sites are recognized by the component of the RNA polymerase
complex known as the sigma factor. Frequently, one of these sites tends to
have the sequence 5'-TATAAT-3' paired with 3'-ATATTA-5' (a Pribnow box), and
the other is usually 5'-TTGACA-3' paired with 3'-AACTGT-5'.
The Pribnow box and the TTGACA sequence are located about 10 bp (the
-10 region) and 35 bp (the -35 region), respectively, upstream from the
site of initiation of transcription (the +1 nucleotide) (Fig. 2.23). A promoter
containing a Pribnow box and the TTGACA sequence is recognized by the
sigma factor RpoD (also called sigma-70 [σ^70] because it has a molecular
mass of 70 kilodaltons). Many bacteria are capable of producing several
different sigma factors, each of which recognizes a different promoter
sequence. For example, E. coli can produce seven different sigma factors,
each of which initiates the transcription of a specific subset of genes,
although there is some overlap among these sigma factors in the promoter
sequences that they recognize (Table 2.2). RpoD, together with RNA polymerase,
binds to the promoters of genes that encode proteins or RNA
molecules that are required for essential or "housekeeping" processes.
Other sigma factors direct RNA polymerase to the promoters of genes that
encode more specialized functions, such as proteins required for adaptation
to environmental stresses (RpoS) or for nitrogen metabolism
(RpoN).
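
A sequence can be screened for candidate RpoD-type promoters by looking for a TTGACA-like element followed, at a suitable distance, by a TATAAT-like element. The sketch below is a rough illustration, not a validated promoter finder: the spacer of 16 to 18 bp between the two elements and the mismatch allowance are assumptions chosen to be consistent with the approximate -35 and -10 positions, and the test sequence is hypothetical.

```python
def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_sigma70_promoters(dna: str, max_mismatches: int = 1,
                           spacer_range: tuple = (16, 18)) -> list:
    """Return (start of -35 element, start of -10 element) index pairs.

    Assumptions: consensus elements TTGACA and TATAAT on the given strand,
    separated by a spacer whose length lies within spacer_range (inclusive).
    """
    minus35, minus10 = "TTGACA", "TATAAT"
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - len(minus35) + 1):
        if mismatches(dna[i:i + 6], minus35) > max_mismatches:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            window = dna[j:j + 6]
            if len(window) == 6 and mismatches(window, minus10) <= max_mismatches:
                hits.append((i, j))
    return hits

# Hypothetical sequence: both consensus elements separated by 17 bp.
dna = "GC" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCA"
print(find_sigma70_promoters(dna, max_mismatches=0))   # [(2, 25)]
```

Allowing a mismatch or two reflects the point that actual promoters rarely match the consensus at every position.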

Nucleotide sequences in and around the RNA polymerase-binding site 
often play an essential role in determining whether an operon is transcribed. 
This regulatory region is usually referred to as the operator region. A plethora 
of elaborate regulatory systems that control the on and off states of various 
operons have evolved. For example, when a regulatory protein called a 
repressor binds to an operator region and prevents RNA polymerase from 
binding to the promoter or moving along the DNA, transcription is blocked

FIGURE 2.23 Schematic representation of a bacterial transcription unit. (A) The structural
genes (A, B, C, and D) of an operon are under the transcriptional control of an
operator (o) and a promoter (p) region. RNA polymerase binds to regions that are
10 (-10) and 35 (-35) bp from the site of initiation of transcription (+1) (right-angled
arrow). The t denotes a transcription termination signal sequence. After transcription
of the operon, the proteins of the operon (α, β, γ, and δ) are produced during
translation. (B) Same as panel A, except that binding of RNA polymerase to the
promoter region is shown.


(Fig. 2.24). However, in some cases, specific low-molecular-weight compounds
(effectors) bind to a particular repressor protein and change its
conformation, thus preventing it from binding to its operator region. When
an effector-repressor complex fails to bind to an operator region, RNA poly- 


TABLE 2.2 Sigma factors produced by E. coli

Sigma factor   Synonym(s)    -35 region              -10 region            Function of genes controlled by sigma factor
RpoD           σ^70          TTGACA                  TATAAT                Most genes required during growth phase
RpoS           σ^38          (none)                  CTACACT               Stationary phase and stress response
RpoN           σ^54          YTGGCAC (-24 region)    TTGCW (-12 region)    Nitrogen metabolism
RpoH           σ^32          TCTCNCCCTTGAA           CCCCATNTA             Heat shock response
RpoF           σ^28, FliA    CTAAA                   CCGATAT               Flagellum synthesis and chemotaxis
RpoE           σ^24          GAANTT                  YCTGA                 Response to misfolded proteins in the periplasm
RpoFecI        σ^19          GAAAAT                  (none)                Iron transport

The DNA-binding sequences shown are the consensus sequences for each sigma factor recognition site. A consensus sequence represents the nucleotides most
frequently found at each position as determined by comparing the nucleotide sequences of many different promoters recognized by the sigma factor. Rarely, however,
are all of the nucleotides in a given recognition site exactly as represented by the consensus sequence. N = A, T, G, or C; Y = T or C; W = A or T; (none) = no consensus
sequence.
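
The degenerate symbols in the table footnote (N, Y, and W) can be applied mechanically when testing whether a site fits a consensus sequence. The sketch below covers only the symbols defined in the footnote; the function name and the example sites are illustrative assumptions.

```python
# Nucleotides permitted by each symbol used in Table 2.2.
ALLOWED = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ATGC",   # N = A, T, G, or C
    "Y": "TC",     # Y = T or C
    "W": "AT",     # W = A or T
}

def matches_consensus(site: str, consensus: str) -> bool:
    """True if every base of site is allowed by the consensus symbol at that position."""
    if len(site) != len(consensus):
        return False
    return all(base in ALLOWED[symbol]
               for base, symbol in zip(site.upper(), consensus.upper()))

print(matches_consensus("CTGGCAC", "YTGGCAC"))   # True  (Y allows C)
print(matches_consensus("ATGGCAC", "YTGGCAC"))   # False (Y does not allow A)
print(matches_consensus("GAACTT", "GAANTT"))     # True  (N allows any base)
```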

The Dependence of Cell-Free Protein Synthesis in E. coli upon Naturally Occurring or Synthetic Polyribonucleotides

M. W. Nirenberg and J. H. Matthaei
Proc. Natl. Acad. Sci. USA 47:1588-1602, 1961

An Intermediate in the Biosynthesis of Polyphenylalanine Directed by Synthetic Template RNA

M. W. Nirenberg, J. H. Matthaei, and O. W. Jones
Proc. Natl. Acad. Sci. USA 48:104-109, 1962

By the early 1960s, due in large part to the efforts of Watson and Crick, it was
known that the sequence of bases in DNA contained the genetic code. However, an
outstanding question was how nucleotide sequences were decoded to produce
proteins with precise sequences of amino acids. Specifically, how could the
arrangement of four different nucleotides (i.e., A, T, G, and C) in a sequence
determine the combination and linear order of 20 different amino acids that were
known to be found in proteins? In a series of papers published in 1961 and 1962,
Marshall W. Nirenberg and his colleague J. Heinrich Matthaei at the National
Institutes of Health described a cell-free system that allowed them to add
various components that might be required to synthesize a protein to a
reaction under controlled conditions. Included in the reaction mixture were
ribosomes and soluble RNA (later determined to be the source of aminoacyl-tRNAs)
extracted from E. coli and radiolabeled amino acids. Only when a simple, synthetic
RNA polymer, polyuridylic acid [poly(U)], was added to the reaction was radiolabeled
L-phenylalanine incorporated into a polypeptide identified as poly-L-phenylalanine.
"The synthetic polynucleotide appears to contain the code for the synthesis of a
protein containing only one amino acid....Polyuridylic acid appears to function as
a synthetic template or messenger RNA." Polyphenylalanine could not be produced
from any other polynucleotide tested, e.g., poly(A), poly(C), or poly(A-U).
Furthermore, phenylalanyl-tRNA was required for the transfer of phenylalanine to
the polypeptide chain. Nirenberg, Matthaei, and Jones suggested that, "since a
sequence of one or more uridylic acid residues in poly-U is the code for
phenylalanine in this system, it is probable that phenylalanine-sRNA [tRNA]
contains a complementary sequence of one or more adenylic acid residues which
base-pair with the template." Not only had the first "word" of the genetic code
been discovered, but also a mechanism was proposed by which the "letters" in mRNA
were interpreted to produce a protein. In a subsequent paper (Proc. Natl. Acad. Sci.
USA 48:666-677, 1962), Matthaei et al. showed that an amino acid coding unit
contains a minimum of three nucleotides and that the genetic code is at least
partially degenerate (two coding units composed of different nucleotides specified
the same amino acid) and contains "nonsense words" that do not encode amino acids
but likely "serve as periods." Nirenberg, together with H. G. Khorana, who showed
that polyribonucleic acids with precise sequences of nucleotides could produce
polypeptides with predicted sequences of amino acids, and R. W. Holley, who worked
out the structure and function of tRNA, was awarded the Nobel Prize in physiology
or medicine in 1968 for the cracking of the genetic code.


merase can bind to the promoter and move along the DNA, and the operon 
is transcribed. Effector molecules that block repression are generally broken 
down by cellular activity. When the levels of an effector molecule are 
reduced, the repressor proteins can bind to the operator region and the off 
state is reestablished. In many cases an operator region is specific for its 
particular operon; however, there are many examples of different operons 
with similar operator sequences that are controlled by the same regulatory 
protein. The proteins encoded in these operons are usually involved in 
related cellular processes. 

By way of illustration, if a cell has the enzymatic capability of catabo- 
lizing a particular sugar, it is a waste of cellular resources to synthesize the 
enzymes that break down the sugar if the sugar is not present in the 
medium. On the other hand, if the sugar is available and is the only 
carbon source, then the enzymes that are responsible for its cellular
utilization are essential. In this case, the sugar acts as an effector, preventing
the repressor from binding to the operator region and thus enabling the 
operon to be transcribed. When the amount of sugar is depleted, the
repressor protein binds to the operator region and prevents transcription
of the operon.

FIGURE 2.24 Induction of the on state for transcription of a bacterial operon. The
repressor protein (R) binds to the operator region and blocks transcription. The
binding of an effector molecule (E) to the repressor protein changes the
conformation of the repressor protein. The repressor protein-effector (RE) complex cannot
bind to the operator region; thus, RNA polymerase can transcribe the operon.

For other operons, transcription may be the normal state because the 
repressor protein is inactive. In these cases, a specific effector molecule 
(corepressor) attaches to an inactive repressor and causes a conformational 
change that enables the repressor-corepressor complex to bind to its
corresponding operator region and turn off the transcription of the operon
(Fig. 2.25). When the concentration of the corepressor decreases, the on 
state for the operon resumes because the repressor by itself is unable to 
bind to the operator. 
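
The two modes of negative control described above can be summarized as a small truth table. The following Python sketch is only an illustration: the function names are hypothetical, and a real operon responds to the concentration of its effector in a graded way rather than as a simple on/off switch.

```python
def inducible_operon_is_on(effector_present: bool) -> bool:
    """Inducible operon (Fig. 2.24): the repressor is active by itself."""
    # The free repressor binds the operator; the repressor-effector
    # complex cannot bind, so the effector switches the operon on.
    repressor_on_operator = not effector_present
    return not repressor_on_operator


def repressible_operon_is_on(corepressor_present: bool) -> bool:
    """Repressible operon (Fig. 2.25): the repressor is inactive by itself."""
    # Only the repressor-corepressor complex can bind the operator,
    # so the corepressor switches the operon off.
    repressor_on_operator = corepressor_present
    return not repressor_on_operator


if __name__ == "__main__":
    for present in (False, True):
        print(
            f"effector present={present}: "
            f"inducible on={inducible_operon_is_on(present)}, "
            f"repressible on={repressible_operon_is_on(present)}"
        )
```

In both cases transcription depends only on whether the operator is occupied; the two systems differ in which form of the repressor is able to bind.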


FIGURE 2.25 Induction of the off state for transcription of a bacterial operon. The 
binding of a corepressor molecule (C) to an inactive repressor protein (IR) changes 
the conformation of the repressor protein. The corepressor-repressor protein
complex (IR-C) binds to the operator region and blocks transcription of the operon by
RNA polymerase.

The regulation of transcription by a repressor protein is called a
negatively controlled system. In a positively controlled system, a regulatory
protein increases the rate of transcription instead of repressing it. Briefly, a
protein (activator protein, or activator) binds to the operator region and
attracts RNA polymerase to the adjacent promoter region; consequently,
transcription of the operon is enhanced (Fig. 2.26). A bound activator
protein does not block the movement of RNA polymerase along the DNA.
Rather, it can be viewed as "greasing the wheels" for transcription. 
Activators are specific for particular activator sites. In some instances, an 
effector molecule converts an active activator to an inactive one and
diminishes the rate of transcription of the operon (Fig. 2.26). In other cases, an
effector molecule activates an inactive activator by altering the
conformation of the activator so that it has an increased binding affinity for the
operator sequence. Understanding how the transcription of a bacterial 
operon is regulated requires detailed molecular studies of mutations that 
affect a regulatory system and in vitro analyses of the various protein- and 
DNA-binding sites. 
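
As a rough illustration of positive control, the sketch below treats the effector as a switch that flips the DNA-binding state of the activator. The function names, the basal rate, and the 10-fold enhancement are arbitrary assumptions chosen for the example, not measured values.

```python
def activator_is_bound(active_without_effector: bool, effector_present: bool) -> bool:
    """Return True if the activator occupies its binding site."""
    # In both cases described in the text, the effector reverses the state
    # of the activator: it inactivates an active activator or activates an
    # inactive one. That is an exclusive-or of the two inputs.
    return active_without_effector != effector_present


def transcription_rate(
    active_without_effector: bool,
    effector_present: bool,
    basal_rate: float = 1.0,       # assumed rate without a bound activator
    fold_enhancement: float = 10.0,  # assumed effect of a bound activator
) -> float:
    """Relative transcription rate of a positively controlled operon."""
    if activator_is_bound(active_without_effector, effector_present):
        return basal_rate * fold_enhancement
    return basal_rate
```

Unlike a repressor, an unbound activator does not shut the operon off in this model; transcription simply falls back to the basal rate.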


Regulation of mRNA Transcription in Eukaryotes 

Most active eukaryotic cells transcribe a common (basal) set of structural 
genes that maintain routine (housekeeping) cellular functions. In some 
cells, specific structural genes are transcribed and translated, giving the 
tissue or organ its unique properties. For example, the genes that encode 
the α and β subunits of adult hemoglobin are expressed only in the cells
that develop into red blood cells. The numbers of cell-specific mRNA
transcripts range from a few sequences in some cells to dozens of different
sequences in others. The ability of cells to turn on (activate) or turn off 
(repress) transcription of particular structural genes is essential for
maintaining cell specificity, for conserving cellular energy, and for enabling cells
to respond to developmental cues or environmental changes. 

FIGURE 2.26 Activation and deactivation of a bacterial operon. An activator protein
(Act) binds to an activating site and enhances the rate of transcription of the operon.
When an effector molecule (E) binds to the activator protein, the Act-E complex
does not bind to the activating site. The rate of transcription of the operon is
diminished when the activating site is not occupied by the activating protein.

There are a number of diverse, highly specific processes that activate or
repress the transcription of various eukaryotic structural genes. In general,
the control of transcription in eukaryotes is mediated by proteins that are
collectively classified as transcription factors. Many transcription factors 
bind directly to DNA sequences that are frequently less than 10 bp in 
length. The naming of these protein-binding sites is idiosyncratic. However, 
for the most part, they are called boxes, DNA modules, initiator elements, 
or response elements. Unlike the situation in prokaryotes, operons are 
almost never found in the genomes of eukaryotes. Consequently, each 
eukaryotic structural gene has its own set of response elements. Moreover, 
in addition to DNA-protein interactions, protein-protein associations are 
important for regulating eukaryotic transcription. 

In addition to specific response elements, a representative eukaryotic 
structural gene has a promoter sequence that binds to a core set of proteins 
that are minimally required for transcription initiation. A eukaryotic
promoter consists of a TATA sequence (TATA box, or Hogness box), a CCAAT
sequence ("cat" box), and a sequence of repeated GC nucleotides (GC box)
that lie about -25, -75, and -90 bp, respectively, from the site of initiation
of transcription (+1) (Fig. 2.27). The first step in the initiation of
transcription of eukaryotic structural genes with a TATA promoter is the binding of
transcription factor IID (TFIID), a complex of at least 14 proteins that includes
the TATA-binding protein (TBP), to an available TATA sequence. Subsequently,
other transcription factors bind to TFIID and the DNA adjacent to the TATA 
box. Then, RNA polymerase II, which is oriented toward the structural 
gene, binds to the transcription complex. With the aid of additional
transcription factors, transcription is initiated at the correct starting point (the
+1 nucleotide) (Fig. 2.28). Clearly, if a TATA sequence is deleted or grossly 
altered, then transcription of the structural gene cannot occur. Transcription 
factors that are specific for the CCAAT and GC response elements have 
been identified. In addition, enhancer sequences that increase the rate of 
transcription of structural genes are located hundreds or even thousands of 
base pairs from the +1 base pair. Folding, looping, or bending of the
chromosomal DNA may bring DNA regions, which in the elongated state are
far apart, close to one another. Also, transcription factors that bind to
certain enhancers or response elements may form a chain of proteins that
create bridges from one DNA site to another. 
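
The promoter layout described above can be illustrated by scanning a DNA sequence for the three elements and reporting where each lies relative to the +1 nucleotide. This is a minimal sketch: the motif strings are simplified consensus sequences assumed for the example (real promoter elements vary), and the function name is hypothetical.

```python
# Simplified consensus motifs; these exact strings are assumptions.
MOTIFS = {
    "TATA box": "TATAAA",
    "CCAAT box": "CCAAT",
    "GC box": "GGGCGG",
}


def find_promoter_elements(dna, tss_index):
    """Map each element name to its upstream start positions.

    dna[tss_index] is the +1 nucleotide, so a motif that starts 25
    nucleotides before it is reported at position -25.
    """
    dna = dna.upper()
    hits = {}
    for name, motif in MOTIFS.items():
        positions = []
        start = dna.find(motif)
        while start != -1 and start < tss_index:
            positions.append(start - tss_index)
            start = dna.find(motif, start + 1)
        hits[name] = positions
    return hits


if __name__ == "__main__":
    # Build an artificial 100-bp upstream region with the elements
    # placed at the approximate positions given in the text.
    upstream = ["C"] * 100
    for motif, position in (("GGGCGG", -90), ("CCAAT", -75), ("TATAAA", -25)):
        i = 100 + position
        upstream[i:i + len(motif)] = motif
    print(find_promoter_elements("".join(upstream) + "ATGGC", tss_index=100))
```

Exact string matching is used only to keep the example short; a more realistic search would allow mismatches or use position weight matrices.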

Some repressed (nonexpressed) structural genes are activated by a
cascade of events that is triggered by a specific extracellular signal, such as a
temperature increase or the presence of a hormone. For example, a
hormone that is released into the circulatory system comes into contact with a
specific cell type that has a receptor on its outer surface that binds the
hormone and facilitates the entry of the hormone into the cell. Once inside the
cell, the hormone interacts with a cytoplasmic protein and changes the
conformation of the protein. In this altered state, the protein is now able to
enter the nucleus, where it binds to an exclusive response element that
initiates transcription of the target gene.

FIGURE 2.27 Promoter and initiator elements of some eukaryotic structural genes.
The negative numbers designate the locations of nucleotide pairs in the DNA that
lie upstream from the site of initiation (+1) of transcription. The right-angled arrow
indicates the site of initiation and the direction of transcription. The locations of the
transcription elements are not drawn to scale.

FIGURE 2.28 Formation of an RNA polymerase II transcription initiation complex at
a TATA box. Transcription factor TFIID binds to a TATA box, and in sequence, other
transcription factors and RNA polymerase II bind to form a protein aggregate that
is responsible for initiating transcription. The right-angled arrow indicates the site
of initiation and the direction of transcription.

Some proteins bind to response elements and prevent transcription. 
For example, there is a class of about 18 vertebrate genes that are actively 
transcribed in nerve cells (neurons) and turned off in nonneuronal cells. 
Each of these neuron-active genes has a 24-bp response element that lies 
upstream of its transcription initiation site. This DNA sequence is called a 
neuron-restrictive silencer element (NRSE). In nonneuronal cells, a protein 
called neuron-restrictive silencer factor (NRSF) is synthesized, binds to 
each NRSE, and prevents transcription of each member of this set of genes.
Conversely, NRSF is not produced by neuronal cells, and therefore, each
gene with an NRSE is transcribed. 

On the whole, the regulation of transcription in eukaryotes is complex. 
A structural gene may have a number of different response elements that 
can be activated in different cell types by different signals at different times 
in the life cycle of an organism. Alternatively, some structural genes are 
under the preferential control of a unique transcription factor. For the off 
state, specific proteins can interact with certain response elements and
prevent transcription, or in a more general way, some proteins obstruct
transcription by binding to the transcription complex either before initiation or
during the elongation process. 

More generalized control of gene expression that influences larger 
regions of the chromosomes is mediated by the state of chromosome
structure. A very large amount of chromosomal DNA must be packaged into
the nucleus of a eukaryotic cell. To facilitate this, the DNA is bound by 
specific proteins called histones that interact with each other to compact 
(condense) the chromosomes into a smaller volume. DNA with its
associated packaging proteins is known as chromatin. Some regions of the
chromosomes are tightly packed (heterochromatin), while other regions are
less condensed (euchromatin). Highly condensed DNA is less accessible to 
regulatory proteins that activate transcription, and therefore, the genes in 
these regions are usually not expressed or are expressed only at a low 
level. Chromatin structure, however, is dynamic, and condensed regions 
can be "relaxed" by the addition of chemical groups, such as an acetyl 
group to amino acids in the packaging proteins or methyl groups to
specific sites in the nucleotide sequences to which the proteins bind.
Unpacking of the chromatin generally increases transcription of genes in 
the region. 


Protein Secretion Pathways 

Bacteria and eukaryotic cells have specialized systems for exporting certain 
proteins (secretory proteins) to the external environment. Generally,
secretory proteins are required for acquiring nutrients, cell-to-cell
communication, protection, and structures that reside on the outer surface of the cell
membrane. The primary impediment to the release of a secretory protein is 
a membrane. The processes that facilitate secretion of proteins through 
such a formidable barrier are similar among all organisms, although there 
are significant differences between organisms. For example, gram-negative 
and gram-positive bacteria do not have the same secretory pathways. A 
secreted protein in gram-negative bacteria must pass through an inner 
membrane, a periplasmic space, and an outer membrane to exit the cell, 
whereas in gram-positive bacteria, secretory proteins are transported only 
across a single cytoplasmic membrane. In contrast, the secretory system in 
higher organisms is more complex. Unlike prokaryotic proteins, many 
eukaryotic proteins require a number of highly specific modifications, such 
as glycosylation, acetylation, sulfation, and phosphorylation, to produce 
functional secretory proteins. Some of these protein modifications and 
various processing steps are carried out in the endoplasmic reticulum, and 
others take place in the Golgi apparatus, where proteins are also sorted 
according to their final cellular destinations, including those that exit
through the cell membrane. The property that distinguishes a protein that
remains in the cytoplasm from one that is secreted is often an amino acid 
sequence (a signal peptide, signal sequence, leader sequence, or leader 
peptide) at its N terminus. 

In gram-positive bacteria, the signal peptide of some secretory proteins 
makes direct contact with a membrane-bound assembly of proteins (a 
secretion complex, or Sec complex) that facilitates the passage of these
proteins through the membrane and their release to the external environment
(Fig. 2.29). Alternatively, for other secretory proteins, a group of proteins
called a signal recognition complex binds to a signal peptide, and this
combination attaches to a membrane-bound signal recognition complex
receptor before making contact with the Sec complex (Fig. 2.29). In both
cases, the secretory protein is translocated through a channel formed by the
Sec complex, and its release depends on removal of the signal peptide by a
membrane-bound enzyme called a signal peptidase. Subsequently,
proteins that have crossed the cytoplasmic membrane readily pass through the
porous cell wall, where they encounter metal ions and other components 
that promote proper folding and molecular stabilization. 

Gram-negative bacteria have multiple pathways for the secretion of 
various proteins. Some of these systems (Sec-dependent pathways) use the 


FIGURE 2.29 Schematic representation of secretion in gram-positive bacteria. A 
signal recognition particle (SRP) binds to the signal peptide of a secretory protein, 
and this complex binds to a membrane protein that directs the secretory protein (1) 
to the Sec complex. There is also an SRP-independent pathway (2), where a signal 
peptide alone makes contact with the Sec complex. The secretory protein is
translocated through a channel within the Sec complex (3), and the signal peptide is
removed by a signal peptidase(s). Proper folding of the secretory protein occurs as 
it passes through the cell wall (4).

FIGURE 2.30 Schematic representation of a type II secretion pathway in
gram-negative bacteria. The SecB protein binds to a secretory protein in the cytoplasm (1),
SecB attaches to the SecA protein that is part of the Sec complex of the inner
membrane (2), and the secretory protein is translocated through the inner membrane (3).
A signal peptidase removes the signal peptide, and the secretory protein is properly 
folded in the periplasm (4); the secretory protein combines with the Gsp complex 
(5); and it is translocated to the external environment (6). 


same membrane-bound Sec complex for transmitting a secretory protein 
through the inner membrane into the periplasm. Collectively, the
Sec-dependent pathways are designated the general secretion pathway. In
these instances, a cytoplasmic protein (SecB) binds an amino acid sequence
(domain) of a secretory protein that has a signal peptide. In turn, the SecB
protein combines with a protein (SecA) of the membrane-bound Sec
complex. The secretory protein is translocated into the periplasm, and the
signal peptide is removed. At this point, the secretory protein encounters
various periplasmic proteins that ensure proper folding. Thereafter,
Sec-dependent secretory proteins exit through the outer membrane by different
routes. A region of some proteins is capable of forming a channel in the 
outer membrane that allows part of the remaining protein to be selectively

FIGURE 2.31 The type III secretion system is made up of about 20 different proteins
that form a continuous channel through the inner and outer membranes of
gram-negative bacteria. The type III secretion system is used by bacterial pathogens to
secrete toxins and other proteins into plant and animal host cells. A hollow
needlelike protein structure extends from the bacterial surface into the host cell.

extruded (autotransporter pathway). In these cases, proteolytic cleavage 
releases the functional portion of the protein to the external environment. 
Other proteins are able to pass through an outer membrane channel that is 
formed by a separate protein (single accessory pathway). Another pathway 
(chaperone/usher pathway) is used by specific proteins that form fimbriae 
on the surface of the bacterial cell. A fourth general secretion pathway 
branch called the type II secretion pathway consists of a protein complex 
(the Gsp complex) that spans the periplasmic space and forms a channel 
through the outer membrane. Most secreted proteins pass through the type 
II pathway. In these cases, secretory proteins earmarked for the type II 
pathway are first transported to the periplasmic space via the
Sec-dependent pathway, where they bind to the Gsp complex and are shunted
through the outer membrane (Fig. 2.30). Other Sec-dependent pathways 
have been found in various gram-negative bacteria. In contrast to the type 
II pathway, the type I and type III secretion pathways are Sec independent, 
and each has its own protein complex that extends from the inner to the 
outer membrane, forming a continuous channel from the bacterial
cytoplasm to the external environment. For example, bacterial flagellar proteins
reach the outer surface of the cell by means of a type III secretion pathway. 
Type III secretion pathways are often used by bacterial pathogens to secrete 
bacterial proteins into the cytoplasm of eukaryotic host cells (Fig. 2.31). 
Signal peptides are recognized by the Sec-independent systems but are not 
necessarily cleaved during the secretion process.

FIGURE 2.32 Schematic representation of the secretion pathway in eukaryotes. (A)
A signal recognition particle (SRP) binds to the signal sequence of a secretory
protein. (B) The SRP attaches to an SRP receptor on the endoplasmic reticulum (ER)
membrane. (C) The secretory protein is translocated into the lumen of the ER, and 
a signal peptidase removes the signal sequence. (D) The secretory protein is 
folded, partially modified, and packaged in a transport vesicle intended for the 
Golgi network. (E) The ER-released vesicle carrying the secretory protein enters 
the Golgi network at the cis face and passes through the Golgi stack, where it is 
further modified; after it is sorted, a plasma membrane-specific vesicle is formed 
at the trans face of the Golgi network. The secretory transport vesicle fuses with the 
plasma membrane and releases the secretory protein to the extracellular
environment.


Protein secretion is basically the same in all eukaryotic organisms from 
yeast to plant and animal cells. Briefly, the signal sequence of a secretory 
protein is bound by a signal recognition particle during protein synthesis; 
the signal recognition particle attaches to a receptor on the membrane of 
the endoplasmic reticulum, and the secretory protein passes through a 
channel in the membrane as translation proceeds; a signal peptidase 
removes the signal sequence; and the secretory protein is released into the 
lumen of the endoplasmic reticulum, where it is folded and, if required, 
glycosylated. A vesicle containing a processed secretory protein buds off 
from the endoplasmic reticulum and is transported to and fuses with the 
cis face of the Golgi apparatus (Fig. 2.32). Additional processing,
glycosylations, and posttranslational modifications take place in the Golgi stack. The
secretory protein then emerges from the trans face of the Golgi apparatus
enclosed in a vesicle that is transported to and fuses with the plasma
membrane, where the contents are released to the external environment (Fig.
2.32). In eukaryotic organisms, some proteins are secreted continuously 
(constitutive secretion). Others remain in vesicles (mature secretory
granules) near the plasma membrane and are released only after a hormone or
membrane depolarization signal is received.

SUMMARY


A DNA molecule has two polynucleotide strands that form 
an antiparallel double helix. The monomeric unit of a 
DNA strand is a nucleotide that consists of a base, a deoxyribose
sugar, and a phosphate group. The successive nucleotides
of a DNA strand are linked by phosphodiester bonds,
and the two strands of DNA are held together by hydrogen 
bonds between specific complementary pairs of bases. During 
replication, which is mediated by a number of different
proteins, including DNA polymerases, each DNA strand acts as a
template for the production of a complementary strand. 

Proteins are vital for the maintenance of all biological
functions. A protein consists of a specific sequence of amino acids
that are linked by peptide bonds. The sequence of amino acids 
for a protein is encoded in the DNA. The process of decoding 
genetic information is carried out by RNA molecules, including 
mRNA, tRNA, and rRNA; various enzymes; and an
assortment of protein factors. All RNA is transcribed from DNA.
Sequences of DNA in combination with protein factors ensure 
that transcription is initiated at a precise starting point, that 
the appropriate strand is used as the template, and that
termination occurs at a specified nucleotide site. In eukaryotic
organisms, most structural genes consist of coding regions 
(exons) separated by noncoding segments (introns). Primary 
transcripts contain both exons and introns. However, a
processing system removes the introns and joins the exons, in the
proper order, to form a functional mRNA. An mRNA carries 
the code for the sequence of amino acids of a protein. 

Translation of mRNA to produce a protein occurs on
ribosomes that are composed of a large and a small subunit, each
containing rRNA and a large number of specific proteins. 
Translation in prokaryotes is initiated by the joining of an 
mRNA with a small ribosomal subunit. As a result of
codon-to-anticodon complementary base pairing, the initiator tRNA,
fMet-tRNA^fMet, attaches to the mRNA-small ribosomal
subunit complex, which then combines with the large ribosomal
subunit to form an initiation complex. Translation in
eukaryotes is initiated by the combining of a unique initiator tRNA,
which carries the amino acid methionine, Met-tRNA^Met, with a
small ribosomal subunit and then by the threading of an 
mRNA through the initiator tRNA-small ribosomal subunit 
complex until the first AUG sequence in the mRNA pairs with 
the anticodon of the initiator tRNA. The large ribosomal
subunit joins the initiator tRNA-small ribosomal subunit-mRNA
complex to form an initiation complex that is ready for the 
translation of the mRNA sequence. 

After the formation of the initiation complex, the
elongation phase of translation is very similar in prokaryotes and
eukaryotes. The next three nucleotides in the mRNA pair with 
the anticodon of a tRNA that carries its specific amino acid. 
The first amino acid, methionine, is cleaved from the initiator 
tRNA and joined to the second amino acid by a peptide bond. 
The "empty" initiator tRNA is ejected from the ribosome, the 
ribosome complex shifts, and the tRNA to which the growing 
peptide is attached occupies the site vacated by the ejected
initiator tRNA. As a consequence of the shift (translocation),
the next codon of the mRNA is available to pair with the 
appropriate anticodon of a tRNA that carries its specific amino 
acid that will be joined to the growing peptide. By repeating 
these steps, a polypeptide with a specific sequence of amino 
acids is formed. Translation is terminated when one of three 
stop codons is encountered in the mRNA on a ribosome. A 
termination factor, rather than a tRNA, recognizes the stop 
codon, and the bond between the last tRNA and the
completed amino acid chain is cleaved, causing the tRNA, mRNA,
and completed protein to be released. 
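
The decoding steps summarized above can be sketched in a few lines of Python. The example follows the eukaryotic scanning model from the summary (translation starts at the first AUG) and stops at the first in-frame stop codon. The function name and the test sequence are hypothetical, one-letter amino acid codes are used, and the input is assumed to contain only A, C, G, and U.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons beginning with U
    "LLLLPPPPHHQQRRRR"  # codons beginning with C
    "IIIMTTTTNNKKSSRR"  # codons beginning with A
    "VVVVAAAADDEEGGGG"  # codons beginning with G
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper().replace(" ", "")
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: no tRNA, the chain is released
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("GGAUGUUUAAAUGACC"))
```

The sketch ignores everything that makes the real process selective, such as ribosome-binding sites in prokaryotes and the sequence context around the start codon.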

Only the RNAs and proteins that are essential for
maintaining routine cellular functions are synthesized
continuously. To conserve cellular resources, transcription of the
remaining genes occurs only when a particular protein is 
required and is turned off when the protein is no longer 
needed. In prokaryotes, transcription is initiated by the 
binding of RNA polymerase to the -10 and -35 elements of the 
promoter region of an operon. Regulatory proteins that bind 
to operator sequences in and around the promoter region
control the activity of RNA polymerase at the promoter and
thereby control transcription initiation. Repressors prevent 
transcription initiation by blocking RNA polymerase binding 
to the promoter or movement along the DNA, while activators 
enhance the binding of RNA polymerase to a promoter 
sequence. The activities of regulatory proteins are controlled 
by small effector molecules that increase or decrease their 
binding to the operator sequence. In eukaryotes, RNA
polymerase II, which transcribes structural genes, binds to an
array of proteins called transcription factors that attach, in 
sequence, to a TATA sequence of a promoter region. Other 
transcription factors that bind to DNA elements of eukaryotic 
structural genes are responsible for turning on or turning off 
transcription. The expression of eukaryotic genes is also
influenced by the local conformation of chromosomal DNA.
Regions that are highly compacted by specific DNA-associated 
proteins are generally not transcribed, while more loosely 
packed regions contain genes that are transcriptionally 
active. 

Prokaryotes and eukaryotes have specialized systems for 
exporting proteins across a cytoplasmic membrane. Secreted 
prokaryotic proteins have a sequence of amino acids at their 
N-terminal ends that targets the protein either to the general 
secretory pathway or to more specialized protein complexes 
that transport specific proteins. Eukaryotic proteins that are to 
be secreted are synthesized on ribosomes associated with the 
endoplasmic reticulum and are first secreted into the lumen of 
the endoplasmic reticulum via an N-terminal signal sequence, 
where they are cleaved, folded, and chemically modified. 
After further processing in the Golgi apparatus, the proteins 
are transported to the cytoplasmic membrane in membrane 
vesicles and, following fusion of the vesicle and cytoplasmic 
membranes, are released into the external environment.

REVIEW QUESTIONS


1. Discuss the basic features of DNA replication. 

2. Compare and contrast DNA and RNA. 

3. Describe the differences and similarities between
prokaryotic and eukaryotic structural genes.

4. Describe the elongation phase of translation. 

5. Deduce the most likely DNA coding sequence for the
following human protein:
MAGGTWYQLFPRKMWNDSTLHPFILPMNVAG.

6. Determine the amino acid sequence encoded by the
following mRNA:
GCGAUCGACGAUGUUUCUAAAAGUAUCUCAUCGAAAUGAGGGUUCGUAAUAGCGACCCGGGCGG.


7. What is an operon? What is the biological significance of an 
operon? 

8. How is transcription initiation controlled in bacterial cells? 

9. Describe the major DNA elements that are responsible for 
the transcription of eukaryotic structural genes. 

10. How are proteins transported across the cytoplasmic 
membrane of gram-positive bacterial cells? 

11. Describe the type II secretion system of gram-negative 
bacterial cells. 

12. How are secretory proteins processed in eukaryotic cells?

Recombinant DNA Technology


Recombinant DNA technology, which is also called gene cloning or
molecular cloning, is a general term that encompasses a number of 
experimental protocols leading to the transfer of genetic information 
(DNA) from one organism to another. There is no single set of methods that 
can be used to meet this objective; however, a recombinant DNA
experiment often has the following format (Fig. 3.1).

• The DNA (cloned DNA, insert DNA, target DNA, or foreign DNA) 
from a donor organism is extracted, enzymatically cleaved (cut, or 
digested), and joined (ligated) to another DNA entity (a cloning 
vector) to form a new, recombined DNA molecule (cloning
vector-insert DNA construct, or DNA construct).

• This cloning vector-insert DNA construct is transferred into and 
maintained within a host cell. The introduction of DNA into a
bacterial host cell is called transformation.

• Those host cells that take up the DNA construct (transformed cells) 
are identified and selected (separated, or isolated) from those that 
do not. 

• If required, a DNA construct can be created so that the protein 
product encoded by the cloned DNA sequence is produced in the 
host cell. 

Recombinant DNA technology was developed from discoveries in 
molecular biology, nucleic acid enzymology, and the molecular genetics of 
both bacterial viruses (bacteriophages) and bacterial extrachromosomal 
DNA elements (plasmids). However, recombinant DNA technology would 
not exist without the availability of enzymes that recognize specific
double-stranded DNA sequences and cleave the DNA in both strands at these
sequences (restriction enzymes, or restriction endonucleases). Nucleases 
that cut nucleic acid molecules internally are endonucleases, and those that 
degrade from the ends of nucleic acids are exonucleases.

CHAPTER 3





FIGURE 3.1 Recombinant DNA-cloning procedure. DNA from a source organism is 
cleaved with a restriction endonuclease and inserted into a cloning vector. The 
cloning vector-insert (target) DNA construct is introduced into a host cell, and 
those cells that carry the construct are identified and grown. If required, the cloned 
gene can be expressed (transcribed and translated) in the host cell, and the protein 
(recombinant protein) can be harvested.

Restriction Endonucleases

For molecular cloning, both the source DNA that contains the target 
sequence and the cloning vector must be consistently cut into discrete and 
reproducible fragments. It was only after bacterial enzymes that cut DNA 
molecules internally at specific base pair sequences were discovered that 
molecular cloning became feasible. These enzymes are formally designated 
type II restriction endonucleases. Despite the fact that there are other kinds 
of restriction endonucleases (type I, type III, and type IV), the type II 
restriction endonucleases are commonly called restriction endonucleases or 
simply restriction enzymes. 

One of the first type II restriction endonucleases to be characterized 
was from the bacterium Escherichia coli, and it was originally designated 
EcoRI. More recently, it has been proposed that the use of italics for naming 
restriction endonucleases be abandoned. Here, we have implemented this 
recommendation. EcoRI is a homodimeric protein (it is made up of two 
identical proteins) that binds to a DNA region with a specific palindromic 
sequence (recognition site, or binding site). In other words, the sequences 
of nucleotides in the two strands of the binding site are identical when 
either is read in the same polarity, i.e., 5' to 3'. The EcoRI recognition 
sequence consists of 6 base pairs (bp) and is cut between the guanine and 
adenine residues on each strand (Fig. 3.2). EcoRI specifically cleaves the 
internucleotide bond between the oxygen of the 3' carbon of the sugar of
one nucleotide and the phosphate group attached to the 5' carbon of the 
sugar of the adjacent nucleotide. The symmetrical staggered cleavage of 
DNA by EcoRI produces two single-stranded, complementary cut ends, 
each with extensions of 4 nucleotides, known as sticky ends. In this case,


FIGURE 3.2 Symmetrical, staggered cleavage of a short fragment of DNA by the type 
II restriction endonuclease EcoRI. The large arrows show the sites of cleavage in the 
DNA backbone. S, deoxyribose sugar; P, phosphate group; OH, hydroxyl group. 
The EcoRI recognition sequence is highlighted by the dashed line. 
each single-stranded extension terminates with a 5' phosphate group, and
the 3' hydroxyl group of the opposite strand is recessed. 
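
The palindromic nature of the EcoRI site and the position of its cut can be
checked with a few lines of code. The following is a minimal sketch, not a
standard tool: the function names and the test sequence are hypothetical, and
only the top strand is modeled, since the cut in the bottom strand is
symmetrical.

```python
# Sketch: test a recognition site for palindromy and cut a top strand with EcoRI.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True when both strands read the same in the 5'-to-3' direction."""
    return site == reverse_complement(site)


def ecori_digest(top: str) -> list[str]:
    """Cut the top strand between the G and the A of every GAATTC site."""
    site, offset = "GAATTC", 1
    fragments, start = [], 0
    pos = top.find(site)
    while pos != -1:
        fragments.append(top[start:pos + offset])
        start = pos + offset
        pos = top.find(site, pos + 1)
    fragments.append(top[start:])
    return fragments


dna = "ATGTTGAATTCTTGAC"  # hypothetical sequence with one EcoRI site
print(is_palindromic("GAATTC"))  # True
print(ecori_digest(dna))         # ['ATGTTG', 'AATTCTTGAC']
```

The second fragment begins with AATT, which is the 4-nucleotide 5' phosphate
extension described above.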

In addition to EcoRI, more than 3,700 type II restriction endonucleases
with about 250 different recognition sites have been isolated from various
bacteria. The naming protocol for these enzymes is the same as that for
EcoRI: the first letter of the genus is capitalized and is followed by the
first two letters of the species name in lowercase. A strain designation is
occasionally added to the name, such as the R in EcoRI, or the serotype of
the source bacterium is sometimes noted, such as the d in HindIII. The
Roman numerals designate the order of characterization of different
restriction endonucleases from the same organism. For example, HpaI and
HpaII are the first and second type II restriction endonucleases that were
isolated from Haemophilus parainfluenzae.
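
The naming protocol can be expressed as a short function. This is an
illustrative sketch only; the function names and argument order are
hypothetical, and the Roman numeral helper is intended for small numbers.

```python
# Sketch: build a restriction endonuclease name from its source organism.
def to_roman(n: int) -> str:
    """Convert a small positive integer to a Roman numeral."""
    numerals = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = ""
    for value, symbol in numerals:
        while n >= value:
            out += symbol
            n -= value
    return out


def enzyme_name(genus: str, species: str, order: int, strain: str = "") -> str:
    """Genus initial + first two species letters + strain/serotype + order."""
    return genus[0].upper() + species[:2].lower() + strain + to_roman(order)


print(enzyme_name("Escherichia", "coli", 1, strain="R"))        # EcoRI
print(enzyme_name("Haemophilus", "influenzae", 3, strain="d"))  # HindIII
print(enzyme_name("Haemophilus", "parainfluenzae", 2))          # HpaII
```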

The palindromic sequences where most type II restriction endonu¬ 
cleases bind and cut a DNA molecule are within the recognition sites. Some 
restriction endonucleases digest (cleave) DNA, leaving 5' phosphate exten¬ 
sions (protruding ends, or sticky ends) with recessed 3' hydroxyl ends; 
some leave 3' hydroxyl extensions with recessed 5' phosphate ends; and 
some cut the backbones of both strands within a recognition site to produce 
blunt-ended (flush-ended) DNA molecules (Fig. 3.3). The lengths of the 
recognition sites for different enzymes can be four, five, six, eight, or more 
nucleotide pairs (Table 3.1). Because of the frequency with which their rec¬ 
ognition sites occur in DNA, restriction endonucleases that cleave within 
sites of four (four-cutters) and six (six-cutters) nucleotide pairs are used for 
most of the common molecular-cloning protocols. The importance of the 
type II restriction endonucleases for gene cloning cannot be overstated. 


FIGURE 3.3 Blunt-end cleavage of a short fragment of DNA by the type II restriction
endonuclease HindII. The large arrows show the sites of cleavage in the DNA
backbone. For abbreviations, see the legend to Fig. 3.2. The HindII recognition
sequence is highlighted.
TABLE 3.1 Recognition sequences of some restriction endonucleases

Enzyme      Recognition site        Type of cut end

EcoRI       G↓A-A-T-T-C             5' phosphate extension
            C-T-T-A-A↑G

BamHI       G↓G-A-T-C-C             5' phosphate extension
            C-C-T-A-G↑G

PstI        C-T-G-C-A↓G             3' hydroxyl extension
            G↑A-C-G-T-C

Sau3AI      ↓G-A-T-C                5' phosphate extension
             C-T-A-G↑

PvuII       C-A-G↓C-T-G             Blunt end
            G-T-C↑G-A-C

HpaI        G-T-T↓A-A-C             Blunt end
            C-A-A↑T-T-G

HaeIII      G-G↓C-C                 Blunt end
            C-C↑G-G

NotI        G↓C-G-G-C-C-G-C         5' phosphate extension
            C-G-C-C-G-G-C↑G

Arrows denote cleavage sites.


When a DNA sample is treated with one of these enzymes, the same set of 
fragments is always produced, assuming that all of the recognition sites are 
cleaved. In addition, ready access to a variety of restriction endonucleases 
adds versatility to gene-cloning strategies. 
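
The relationship between the length of a recognition site and how often it
occurs can be estimated with a simple calculation. The sketch below assumes,
for illustration only, a random DNA sequence in which the four nucleotides
are equally frequent; real genomes deviate from this, and the function names
and the 1,000,000-bp example length are hypothetical.

```python
# Sketch: expected spacing of recognition sites in random DNA (equal base frequencies).
def expected_spacing(site_length: int) -> int:
    """Average number of base pairs between occurrences of a specific site."""
    return 4 ** site_length


def expected_sites(dna_length_bp: int, site_length: int) -> float:
    """Approximate number of sites expected in DNA of the given length."""
    return dna_length_bp / expected_spacing(site_length)


for n in (4, 6, 8):
    print(n, expected_spacing(n), round(expected_sites(1_000_000, n), 1))
```

Under these assumptions, a four-cutter site occurs about once every 256 bp,
a six-cutter site about once every 4,096 bp, and an eight-cutter site about
once every 65,536 bp.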

Type IIS restriction endonucleases form a subgroup of the type II
category of restriction enzymes and are occasionally used for cloning and
other molecular studies, such as multiplex polony sequencing, that are
discussed in chapter 4. These enzymes have the fascinating feature of
cutting DNA, usually in both strands, a fixed number of nucleotides away
from one end of the recognition site. Moreover, any particular sequence of
nucleotides may be present between the binding sequence and the cut sites.
The cleavages for most type IIS restriction enzymes are staggered. For
example, the FokI restriction endonuclease binds to GGATG/CCTAC and cuts
9 nucleotides downstream on the upper strand and 13 nucleotides downstream
on the lower strand, producing a recessed 3' hydroxyl end and a
4-nucleotide extension at the 5' phosphate end. One representation of the
recognition sequence and cut sites of the FokI restriction endonuclease is

    5' GGATGNNNNNNNNN↓NNNN 3'
    3' CCTACNNNNNNNNNNNNN↑ 5'

where N denotes A, C, G, or T. Of course, with this single-letter code, it
is understood that the nucleotides (N) opposite each other are base paired.
A simpler notation is GGATG(N)9/CCTAC(N)13, and perhaps the simplest is
GGATG(9/13). Examples of some type IIS restriction endonucleases are
shown in Table 3.2. It should be noted that a few type IIS restriction
endonucleases cleave DNA both upstream and downstream from their
recognition sites.
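
The GGATG(9/13) notation translates directly into cut positions on a
sequence. The following sketch assumes a site that is read on the top strand
and an enzyme that leaves a 5' extension (the lower-strand offset is the
larger of the two); the function name and the test sequence are hypothetical.

```python
# Sketch: locate the cuts made by a type IIS enzyme written as SITE(top/bottom).
def type_iis_cut(top: str, site: str = "GGATG",
                 top_offset: int = 9, bottom_offset: int = 13):
    """Return (top_cut, bottom_cut, extension) for the first site, or None.

    Cut positions are indices into the top strand; the extension is the
    single-stranded stretch left between the two staggered cuts.
    """
    pos = top.find(site)
    if pos == -1:
        return None
    end = pos + len(site)
    top_cut = end + top_offset
    bottom_cut = end + bottom_offset
    if bottom_cut > len(top):
        return None  # the molecule is too short to contain both cut sites
    return top_cut, bottom_cut, top[top_cut:bottom_cut]


dna = "TTGGATGACGTACGTACCATGG"  # hypothetical sequence with one FokI site
print(type_iis_cut(dna))  # (16, 20, 'CCAT')
```

Because the cuts fall outside the recognition site, the 4-nucleotide
extension (CCAT in this hypothetical sequence) depends on the neighboring
DNA and not on the enzyme.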

Under natural conditions, bacteria use restriction endonucleases to 
cleave foreign DNA, such as that of infecting bacterial viruses (bacterio¬ 
phages), and have developed systems that protect their own DNA from 
being degraded. Most often, methylation of the cytosine residues of a 
restriction endonuclease site in the host DNA prevents restriction endonu¬ 
cleases from cutting at these sites, but the nonmethylated sites of foreign 
DNA are vulnerable to attack. With the characterization of large numbers 
of restriction endonucleases from various bacteria, interesting relationships 
have been observed. In some instances, different phosphodiester bonds 
TABLE 3.2 Some examples of type IIS restriction endonucleases

Restriction endonuclease    Recognition sequence

AcuI                        5' CTGAAG(N)16
                            3' GACTTC(N)14

BfuAI                       5' ACCTGC(N)4
                            3' TGGACG(N)8

BsmBI                       5' CGTCTC(N)1
                            3' GCAGAG(N)5

EciI                        5' GGCGGA(N)11
                            3' CCGCCT(N)9

FokI                        5' GGATG(N)9
                            3' CCTAC(N)13

HgaI                        5' GACGC(N)5
                            3' CTGCG(N)10

MlyI                        5' GAGTC(N)5
                            3' CTCAG(N)5

MmeI                        5' TCCRAC(N)20
                            3' AGGYTG(N)18

Recognition sequences are shown with locations of downstream cleavage sites. N = A, T, G, or C; R = A
or G; Y = T or C. The paired single-letter codes R/Y represent the base pair A/T or G/C, and N/N
represents A/T, T/A, G/C, or C/G.

FIGURE 3.4 Neoschizomers. Four restriction endonucleases bind to the same
recognition site and cleave at different positions. The restriction
endonucleases and cleavage sites (arrows) are color coded: KasI, red; NarI,
blue; SfoI, black; BbeI, green. A number of other restriction endonucleases,
such as NdaI, Mly113I, MchI, BinSII, DinI, EgeI, and EheI, that bind to and
cleave this sequence are not shown.
within the same recognition site are cleaved by restriction endonucleases
from different organisms. For example, the restriction enzymes XmaI and
SmaI both recognize the sequence CCCGGG/GGGCCC, but XmaI cleaves after
the first 5' cytosine in each strand and produces 5' phosphate extensions,
whereas SmaI generates blunt ends by cleaving between the C·G and G·C base
pairs in the middle of the recognition site. The first restriction
endonuclease that is discovered to bind to a particular recognition site is
designated the prototype. Any additional restriction endonucleases that
attack the same sequence as the prototype are called isoschizomers. For
example, the restriction endonucleases XhoI and PaeR7I from different
organisms both have the same recognition sequences and cleavage locations.
Isoschizomers that cleave at different positions within the same
recognition site are neoschizomers (Fig. 3.4). On the other hand,
restriction endonucleases that produce the same nucleotide extensions but
have different recognition sites are designated isocaudomers, e.g., BamHI
and Sau3AI (Table 3.1). In some cases, a restriction endonuclease will
cleave a sequence only if the cytosines of the recognition site are not
methylated, whereas another restriction endonuclease will cut the same
sequence if these cytosines are methylated. For example, HpaII cleaves only
nonmethylated CCGG/GGCC sites, and MspI, an isoschizomer of HpaII, cuts
this sequence regardless of cytosine methylation. This pair of restriction
endonucleases is often used to determine the methylation status of genomic
DNA. If a DNA molecule is not cut by HpaII but is cut by MspI, then the
recognition site is methylated. If both restriction endonucleases cleave a
DNA molecule, then the site(s) is not methylated.
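
The HpaII/MspI comparison amounts to a small decision rule, sketched here.
The function name is hypothetical, and the "undetermined" outcome for
results that the text does not discuss (for example, no cleavage by either
enzyme) is an assumption added for completeness.

```python
# Sketch: infer the methylation status of a CCGG site from two digestions.
def methylation_status(cut_by_hpaii: bool, cut_by_mspi: bool) -> str:
    """Interpret paired HpaII and MspI digestion results for one site."""
    if cut_by_hpaii and cut_by_mspi:
        return "not methylated"
    if cut_by_mspi:
        return "methylated"  # MspI cuts, HpaII is blocked
    return "undetermined"    # assumption: outcome not covered by the text


print(methylation_status(cut_by_hpaii=False, cut_by_mspi=True))  # methylated
print(methylation_status(cut_by_hpaii=True, cut_by_mspi=True))   # not methylated
```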

Physical maps that designate the relative positions of restriction endo¬ 
nuclease sites on a specific piece of DNA can be constructed by treating the 
DNA molecule singly with different restriction endonucleases and then 
with combinations of the same restriction endonucleases. The positions of 
the cleavage sites can be deduced from an analysis of fragment sizes, 
which are determined by agarose gel electrophoresis (Box 3.1). By way of 
illustration, the fragment sizes produced by various digestions are shown 
in Fig. 3.5A. It can be deduced that this linear piece of DNA has two
BamHI sites and two EcoRI sites in a definite order with a specified 
number of base pairs separating the sites. More specifically, the sizes of 
fragments produced in each single digestion can be compared with those 
from double digestions to determine the positions of the restriction endo¬ 
nuclease sites and to generate a restriction endonuclease site map (restric¬ 
tion endonuclease map). In the example shown in Fig. 3.5, the analysis 
goes as follows. Because each single digestion produces three fragments 
from a linear DNA molecule, the original piece of DNA must contain two 
sites for each of the restriction endonucleases. The 3,000-bp fragment that 


BOX 3.1 


Gel Electrophoresis 

Gel electrophoresis is a commonly
used technique for resolving pro¬ 
teins or nucleic acids. In general, a 
sample of one particular type of mac¬ 
romolecule (protein, DNA, or RNA) is 
placed in a well at or near the end of a 
gel matrix (gel). The composition of an 
electrophoresis gel is a semisolid open 
meshwork of interlinked linear 
strands. A gel is cast as a thin slab 
with a number of sample wells. After 
the wells of a gel are loaded with 
sample, an electric field is applied 
across the gel, and charged macromol¬ 
ecules of the same size are driven 
together in the direction of the anode 
through the gel as discrete invisible 
bands of material. The distance that a 
band moves into a gel depends on the 
mass of its macromolecules and the 
size of the openings (pore size) of the 
gel. The smaller macromolecules 
travel further than the larger ones. 

The progress of gel electrophoresis 
is monitored by observing the migra¬ 
tion of a visible dye (tracking dye) 
through the gel. The tracking dye is a 
charged, low-molecular-weight com¬ 
pound that is loaded into each sample 
well at the start of a run. When the 
tracking dye reaches the end of the 
gel, the run is terminated. The bands, 
which are aligned in a lane under each 
well, are visualized by staining the gel 
with a dye that is specific for protein, 
DNA, or RNA. Discrete bands are 
observed when there is enough mate¬ 
rial present in a band to bind the dye 
to make the band visible and when 
the individual macromolecules of a
sample have distinctly different sizes.
Otherwise, a band is not detected. If 
there is little or no difference among 
the sizes of the macromolecules in a 
concentrated sample, a smear of 
stained material is observed. The 
intensity of a stained band reflects the 
frequency of occurrence of a macro¬ 
molecule in a sample. 

The molecular mass (molecular 
weight) of a gel-fractionated macro¬ 
molecule (band) is determined from a 
standard curve that is based on a set 
of macromolecules of known molec¬ 
ular mass (size markers) that covers 
the separation range of the gel system 
and is run in one or both of the out¬ 
side lanes (calibrator lanes) of the 
same gel as the samples. The loga¬ 
rithm of the molecular mass of a size 
marker is related to its relative 
mobility (Rf) through a gel. The value
of Rf is defined as the distance trav¬ 
eled by a band divided by the distance 
traveled by the tracking dye (ion 
front). The relationship between the 
logarithm of the molecular mass of 
each size marker and its Rf value is 
plotted. Then, with this standard 
curve, a molecular mass is calculated 
for each band in a lane. The units of 
molecular mass for proteins and 
double-stranded and single-stranded 
nucleic acids are daltons, base pairs, 
and bases, respectively. The size 
markers are included in the same gel 
as the samples because the extent of 
mobility of a macromolecule(s) varies 
from one electrophoretic run to the 
next. 

Polyacrylamide is the preferred gel 
system for separating proteins.
Copolymerization of monomeric
acrylamide and the cross-linker 
bisacrylamide forms a lattice of cross- 
linked, linear polyacrylamide strands. 
The pore size of a polyacrylamide gel 
is determined by the concentration of 
acrylamide and the ratio of acryl¬ 
amide to bisacrylamide. For many 
applications, a protein sample is 
treated with the anionic detergent 
sodium dodecyl sulfate (SDS) before 
electrophoresis. The SDS binds to pro¬ 
teins and dissociates most multichain 
proteins. Each SDS-coated protein 
chain has a similar charge-to-mass 
ratio. Consequently, during electro¬ 
phoresis, the separation of the SDS- 
protein chains is based primarily on 
size, and the effect of conformation is 
eliminated. SDS-polyacrylamide gel 
electrophoresis with a 10% polyacryl¬ 
amide gel resolves proteins that range 
from 20 to 200 kilodaltons (kDa). 

Agarose, which is a polysaccharide 
from seaweed, is used routinely as the 
gel matrix for the electrophoretic sepa¬ 
ration of medium-size nucleic acid 
molecules. A 1.0% agarose gel can 
resolve duplex DNA chains that 
range from about 600 to 10,000 bp. 
Specialized agarose gel electrophoresis 
systems are available for fractionating 
DNA molecules with millions of base 
pairs, denatured DNA, and denatured 
RNA. In addition, for specific pur¬ 
poses, polyacrylamide gels are used 
for separating DNA molecules. For 
example, DNA chains that are as small 
as 6 bases and that differ from each 
other by 1 nucleotide can be resolved 
with a 20% polyacrylamide gel. 
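
The standard-curve calculation described in this box can be sketched as
follows. The sketch assumes that log10(size) varies linearly with Rf over the
range of the markers, which is only an approximation for a real gel; the
marker values and function names are hypothetical.

```python
# Sketch: estimate fragment size from relative mobility with a log-linear standard curve.
import math


def relative_mobility(band_distance: float, dye_distance: float) -> float:
    """Rf = distance traveled by a band / distance traveled by the tracking dye."""
    return band_distance / dye_distance


def fit_standard_curve(markers):
    """Least-squares fit of log10(size) = intercept + slope * Rf."""
    xs = [rf for rf, _ in markers]
    ys = [math.log10(size) for _, size in markers]
    n = len(markers)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size(rf: float, slope: float, intercept: float) -> float:
    """Read a size off the standard curve for a band with the given Rf."""
    return 10 ** (intercept + slope * rf)


# Hypothetical size markers: (Rf, size in bp).
markers = [(0.2, 10_000), (0.4, 3_162), (0.6, 1_000), (0.8, 316)]
slope, intercept = fit_standard_curve(markers)
rf = relative_mobility(band_distance=4.0, dye_distance=8.0)
print(round(estimate_size(rf, slope, intercept)))
```

The negative slope of the fitted line reflects the fact that smaller
macromolecules travel further than larger ones.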
FIGURE 3.5 Mapping of restriction endonuclease sites. (A) Restriction endonuclease
digestions and electrophoretic separation of fragments. A purified, linear piece of 
DNA is cut with EcoRI and BamHI separately (single digestions) and then with 
both enzymes together (double digestion). The horizontal lines under the digestion 
conditions represent schematically the locations of the DNA fragments (bands) in 
the lanes of the gel after electrophoresis and staining of the DNA with ethidium 
bromide. The numbers denote the lengths of the digestion products (fragments) in 
base pairs. (B) Restriction endonuclease map derived from the digestions and elec¬ 
trophoretic separation shown in panel A. 


is produced by the single EcoRI digestion remains intact after the double 
digestion, whereas the 8,500- and 5,000-bp EcoRI fragments are cleaved. 
Therefore, the two EcoRI sites are 3,000 bp apart with no intervening 
BamHI site, and there is a single BamHI site within each of the 8,500- and 
5,000-bp EcoRI fragments. The 9,500-bp fragment that is produced by the 
single BamHI digestion is cleaved by EcoRI in the double digestion into 
three pieces (2,500 + 3,000 + 4,000 = 9,500 bp). Therefore, the two BamHI 
sites lie 2,500 and 4,000 bp to either side of the EcoRI sites. Digestion with 
BamHI cleaves the 8,500-bp EcoRI fragment into 2,500- and 6,000-bp frag¬ 
ments, and one of the EcoRI sites is 2,500 bp from a BamHI site, so the 
6,000-bp region must include one of the ends of the original molecule. 
Using the same logic, we also note that digestion with BamHI cuts the 
5,000-bp EcoRI fragment into 1,000- and 4,000-bp fragments and that one 
of the EcoRI sites is 4,000 bp from a BamHI site; therefore, the 1,000-bp 
TABLE 3.3 DNA fragment sizes (in kilobase pairs) after single and double restriction endonuclease digestions of a
plasmid

EcoRI  BamHI  HindIII  HaeII  EcoRI +  BamHI +  HindIII +  EcoRI +  EcoRI +  BamHI +
                              HaeII    HaeII    HaeII      HindIII  BamHI    HindIII

12.0   12.0   12.0     6.0    5.0      6.0      6.0        6.5      10.5     8.0
                       4.0    4.0      4.0      2.5        5.5      1.5      4.0
                       2.0    2.0      1.5      2.0
                              1.0      0.5      1.5


region must include the other end of the original molecule. In the final 
map (Fig. 3.5B), the assigned locations of the restriction endonuclease sites 
are consistent with the fragment lengths that were observed in each of the 
digestion reactions. 
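
The deduced map can be checked by predicting the fragments that each
digestion should produce. In the sketch below, the total length (16,500 bp)
and the site coordinates are inferred from the fragment sizes quoted in the
text, with position 0 placed, as an assumption, at the end of the molecule
that carries the 6,000-bp region; the function name is hypothetical.

```python
# Sketch: predict digestion fragments of a linear DNA from a proposed site map.
def linear_fragments(length: int, sites) -> list[int]:
    """Fragment sizes (largest first) after cutting a linear molecule at the sites."""
    edges = [0, *sorted(set(sites)), length]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


LENGTH = 16_500           # sum of the fragments in any one digestion
ECORI = [8_500, 11_500]   # inferred coordinates (bp)
BAMHI = [6_000, 15_500]   # inferred coordinates (bp)

print(linear_fragments(LENGTH, ECORI))          # [8500, 5000, 3000]
print(linear_fragments(LENGTH, BAMHI))          # [9500, 6000, 1000]
print(linear_fragments(LENGTH, ECORI + BAMHI))  # [6000, 4000, 3000, 2500, 1000]
```

Each predicted list sums to 16,500 bp, and the double digestion reproduces
the 2,500-, 3,000-, and 4,000-bp pieces that make up the 9,500-bp BamHI
fragment.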

The process of formulating a restriction endonuclease map for circular
DNA is, in principle, the same as with linear DNA, except that each
cleavage produces a fragment. In other words, three pieces are formed
when three sites are cut, and so on. With the data in Table 3.3, the deduction
of a restriction endonuclease map of a circular plasmid is as follows. The
source DNA is a 12-kilobase-pair (kb) circle with single EcoRI, BamHI, and
HindIII sites and three HaeII sites. Digestion with EcoRI, BamHI, or HindIII
separately produces a single 12-kb fragment, while digestion with HaeII
produces three fragments (6.0, 4.0, and 2.0 kb). The results of the EcoRI and
HaeII double digestion indicate that the EcoRI site lies within the 6.0-kb
HaeII fragment, because the 2.0-kb and 4.0-kb HaeII fragments remain intact
and the sum of the two new pieces (5.0 kb + 1.0 kb) is 6.0 kb. Based on the
BamHI and HaeII double digestion, the BamHI site lies within the 2.0-kb HaeII
region. The BamHI and EcoRI double digestion places these sites 1.5 kb apart;
therefore, the 6.0-kb and 2.0-kb HaeII fragments are adjacent. The data do
not support any other positions for the BamHI and EcoRI sites. The order
of the HaeII segments around the circular molecule is 6.0 kb-4.0 kb-2.0 kb.
The same reasoning localizes the HindIII site to the 4.0-kb HaeII fragment,
and the results from the BamHI and HindIII and/or EcoRI and HindIII
double digestions complete the restriction endonuclease map (Fig. 3.6).
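
The same deduction can be carried out by exhaustive search, which is
essentially what mapping programs do on a larger scale. The sketch below is
an illustration under stated assumptions: the HaeII sites are placed at 0,
6,000, and 10,000 bp (one arbitrary choice of origin and orientation, which
need not match the coordinates used in Fig. 3.6), candidate positions are
tried every 500 bp, and all names are hypothetical.

```python
# Sketch: search for single-site positions consistent with the Table 3.3 double digestions.
from itertools import product

PLASMID_BP = 12_000
HAEII = (0, 6_000, 10_000)  # assumed coordinates: 6.0-, 4.0-, 2.0-kb fragments in order

# Double-digestion fragment sizes from Table 3.3, converted to base pairs.
OBSERVED = {
    "EcoRI+HaeII": [5_000, 4_000, 2_000, 1_000],
    "BamHI+HaeII": [6_000, 4_000, 1_500, 500],
    "HindIII+HaeII": [6_000, 2_500, 2_000, 1_500],
    "EcoRI+HindIII": [6_500, 5_500],
    "EcoRI+BamHI": [10_500, 1_500],
    "BamHI+HindIII": [8_000, 4_000],
}


def circular_fragments(sites, length=PLASMID_BP):
    """Fragment sizes (largest first) after cutting a circle at the given sites."""
    cuts = sorted(set(sites))
    # A single cut linearizes the circle, giving one full-length fragment.
    sizes = [(cuts[(i + 1) % len(cuts)] - cuts[i]) % length or length
             for i in range(len(cuts))]
    return sorted(sizes, reverse=True)


def consistent_maps(step=500):
    """Return every (EcoRI, BamHI, HindIII) placement that fits all digestions."""
    hits = []
    for eco, bam, hind in product(range(0, PLASMID_BP, step), repeat=3):
        predicted = {
            "EcoRI+HaeII": circular_fragments([eco, *HAEII]),
            "BamHI+HaeII": circular_fragments([bam, *HAEII]),
            "HindIII+HaeII": circular_fragments([hind, *HAEII]),
            "EcoRI+HindIII": circular_fragments([eco, hind]),
            "EcoRI+BamHI": circular_fragments([eco, bam]),
            "BamHI+HindIII": circular_fragments([bam, hind]),
        }
        if predicted == OBSERVED:
            hits.append((eco, bam, hind))
    return hits


print(consistent_maps())  # [(1000, 11500, 7500)]
```

With this choice of origin, only one arrangement satisfies every digestion,
in agreement with the statement that the data do not support any other
positions for the sites.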

For some restriction endonuclease mapping experiments, the sum of 
the fragments of some multiple digestions is less than the total length of the 
starting DNA because the fortuitous locations of some sites produce frag¬ 
ments of the same size. Under these conditions, two different fragments 
with the same length that migrate to the same location in a gel after electro¬ 
phoresis often stain more heavily than a band with only one kind of frag¬ 
ment. This difference in staining intensity gives an indication that 
coincidental fragments have been produced by restriction endonuclease 
digestion. Generally, computer programs are used to configure restriction 
endonuclease maps for large DNA molecules with many single and mul¬ 
tiple digestions. Also, for very large DNA molecules, specialized electro¬ 
phoresis systems are used to separate the large number of restriction 
endonuclease digestion products. 

The resolution of fragments for restriction endonuclease mapping can 
be enhanced by labeling the pieces of DNA, usually at the 5' ends, with a 
radioactive compound or fluorescent dye and determining their lengths 
after electrophoretic separation with autoradiography or fluorography, 
respectively. A common 5'-end-labeling procedure entails dephosphoryla¬ 
tion of the 5' ends of a linear DNA molecule with calf intestine alkaline 


FIGURE 3.6 Restriction endonuclease map for the digestion products
presented in Table 3.3. The circular DNA has 12,000 bp (12 kb). With one of
the HaeII sites arbitrarily placed at position 0/12.0 kb, the locations of
the other mapped restriction endonuclease sites are marked. The nucleotide
positions of the sites are in parentheses.
phosphatase and the addition of radiolabeled γ-phosphate from adenosine
triphosphate (ATP) to the 5' OH ends by T4 polynucleotide kinase. The
labeled DNA fragments are separated from unincorporated label by
column chromatography before gel electrophoresis. Parenthetically,
recombinant DNA technology requires many different enzymes with various
activities. Some of these are listed in Table 3.4.

Restriction endonuclease cleavage is important in molecular cloning for 
inserting target DNA into a cloning vector. When two different DNA sam¬ 
ples are digested with the same restriction endonuclease that produces a 
staggered cut, i.e., the same 5' or 3' extension or sticky end, and then mixed 
together, new DNA combinations can be formed as a result of base pairing 
between the extension (overhang) regions (Fig. 3.7). However, restriction 
enzymes alone are not sufficient for molecular cloning. First, when the 
extended ends that are created by restriction enzyme (e.g., BamHI) cleavage 
are aligned, the hydrogen bonds of the four bases that pair are not strong 
enough to keep two DNA molecules together. A means of re-forming the 
internucleotide linkage between the 3' hydroxyl group and the 5' phosphate
group in the backbone at the two broken bond sites (nicks) is required. This 
problem is resolved by using the enzyme DNA ligase, usually from bacte¬ 
riophage T4. This enzyme catalyzes the formation of phosphodiester bonds 
at the ends of DNA strands that are already held together by the base 
pairing of two extensions. DNA ligase also joins blunt ends that come in 
contact when they both bind to the enzyme (Fig. 3.8). The reaction condi¬ 
tions for DNA ligations depend on whether the DNA molecules have exten¬ 
sions or blunt ends. With protruding ends, the reaction is often carried out 
at low temperatures for long periods to ensure that the extensions remain 
base paired. Blunt-end ligations require 10 to 100 times more T4 DNA ligase 


TABLE 3.4 Some of the enzymes used for recombinant DNA technology

Enzyme                      Activity

Alkaline phosphatase        Removes 5' phosphate groups of DNA molecules; bacterial alkaline
                            phosphatase is more stable but less active than calf intestinal
                            alkaline phosphatase
DNase I                     Degrades double-stranded DNA by hydrolyzing internal phosphodiester
                            linkages
E. coli exonuclease III     Sequentially removes nucleotides from 3' OH ends of DNA molecules,
                            except from protruding 3' OH termini
Klenow fragment             Proteolytic product of E. coli DNA polymerase I that has both
                            polymerase and 3' exonuclease activities and no 5' exonuclease
                            activity because fractionation of the digestion products removes the
                            fragment with the 5' exonuclease activity; a Klenow fragment with
                            only DNA polymerase activity due to a mutation in the 3' exonuclease
                            sequence is also available
Mung bean nuclease          Single-stranded DNA and RNA endonuclease
Nuclease BAL 31             Degrades both 3' and 5' ends of DNA without internal cleavages
Poly(A) polymerase          Adds AMP from ATP to the 3' end of mRNA
Reverse transcriptase       Retroviral RNA-directed DNA polymerase
RNase H                     Degrades the RNA strand of a DNA-RNA hybrid molecule
S1 nuclease                 Degrades single-stranded DNA
T4 polynucleotide kinase    Catalyzes the transfer of the terminal (γ) phosphate from a nucleoside
                            5' triphosphate to a 5' hydroxyl group of a polynucleotide
T4 DNA polymerase           DNA polymerase and 3' exonuclease activities
T7 DNA polymerase           DNA polymerase and 3' exonuclease activities
Taq DNA polymerase          Heat-stable DNA polymerase from Thermus aquaticus
β-Agarase I                 Digests agarose; is used to retrieve separated DNA molecules from
                            agarose gels

FIGURE 3.7 Annealing complementary extensions after staggered cleavage with a 
type II restriction endonuclease. Two different DNA fragments are cut with the 
restriction endonuclease BamHI, mixed, and annealed. Not all of the possible com¬ 
binations of annealed DNAs are shown. The four fragments that are generated by 
the BamHI digestion can anneal to one another to form any of six different DNA 
molecules. A break in the phosphodiester bond in one strand of duplex DNA is 
called a nick. The hydrogen bonds of the four base pairs between nicks on opposite 
strands are not strong enough to hold DNA molecules together for long periods in 
solution. A, C, G, and T represent nucleotides. 


than do ligations of DNA molecules with extensions and are conducted at 
room temperature because stable base pairing is not required. 
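
Whether two digested DNA samples can anneal through their extensions
depends on the ends that each enzyme leaves. The following sketch derives
the end type from the recognition sites and cut positions listed in Table
3.1. It assumes palindromic sites, for which the cut in the bottom strand
mirrors the cut in the top strand; the data structure and function names are
hypothetical.

```python
# Sketch: derive cut ends from Table 3.1 data and compare extensions.
# Each entry: (recognition site, number of top-strand nucleotides before the cut).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "Sau3AI": ("GATC", 0),
    "PstI": ("CTGCAG", 5),
    "PvuII": ("CAGCTG", 3),
}


def cut_end(enzyme: str) -> tuple[str, str]:
    """Return (type of cut end, extension sequence read 5' to 3')."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut  # bottom-strand cut, in top-strand coordinates
    if cut < mirror:
        return ("5' extension", site[cut:mirror])
    if cut > mirror:
        return ("3' extension", site[mirror:cut])
    return ("blunt", "")


def same_extension(a: str, b: str) -> bool:
    """True when both enzymes leave the same single-stranded extension."""
    end_a, end_b = cut_end(a), cut_end(b)
    return end_a[0] != "blunt" and end_a == end_b


print(cut_end("BamHI"))                    # ("5' extension", 'GATC')
print(same_extension("BamHI", "Sau3AI"))   # True
print(same_extension("BamHI", "EcoRI"))    # False
```

BamHI and Sau3AI both leave a GATC extension, so fragments produced by
either enzyme can base pair with each other before ligation, whereas
EcoRI-cut ends (AATT) cannot pair with BamHI-cut ends.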

Second, the ability to join different DNA molecules is not by itself useful 
unless the new DNA combination (i.e., recombinant DNA) can be perpetu¬ 
ated in a host cell. Thus, the ligated construct must contain the biological 
information for cellular maintenance. This requirement is usually provided 
on cloning vectors that were developed to overcome this problem. 

Third, digestion of the source DNA containing the gene of interest with 
a restriction endonuclease produces a mixture of DNA molecules, and a 
number of different DNA constructs are formed after ligation with a 
cloning vector. Consequently, there has to be a way of identifying the DNA 
combination in a host cell that contains the target DNA sequence. Screening 
procedures have been devised to detect host cells carrying a specific 
cloning vector-DNA insert construct. 


Plasmid Cloning Vectors 

Plasmids are self-replicating, double-stranded, circular DNA molecules that 
are maintained in bacteria as independent extrachromosomal entities. 
FIGURE 3.8 Mode of action of T4 DNA ligase. The enzyme T4 DNA ligase forms
phosphodiester bonds by joining 5' phosphate and 3' hydroxyl groups at nicks in
the backbone of double-stranded DNA. (A) Ligation of sticky-ended DNA;
(B) ligation of blunt-ended DNA. A, C, G, and T represent nucleotides.


Virtually all bacterial genera have natural plasmids. Some plasmids carry 
information for their own transfer from one cell to another (e.g., F plasmids), 
others encode resistance to antibiotics (R plasmids), others carry specific 
sets of genes for the utilization of unusual metabolites (degradative plas¬ 
mids), and some have no apparent functional coding genes (cryptic plas¬ 
mids). Although they are not typically essential for bacterial cell survival 
under laboratory conditions, plasmids often carry genes that are advanta¬ 
geous under particular conditions. Plasmids can range in size from less than 
1 kb to more than 500 kb. Each plasmid has a sequence that functions as an 
origin of DNA replication; without this site, it cannot replicate in a host 
cell. 

Some plasmids are represented by 10 to 100 copies per host cell; these 
are called high-copy-number plasmids. Others maintain one to four copies 
per cell and are called low-copy-number plasmids. Seldom does the popu¬ 
lation of plasmids in a bacterium make up more than approximately 0.1 to 
5.0% of the total DNA. When two or more different plasmids cannot coexist 
in the same host cell, they are said to belong to the same incompatibility 
group, but plasmids from different incompatibility groups can be main¬ 
tained together in the same cell. This coexistence is independent of the copy 
numbers of the individual plasmids. Some microorganisms have been 
found to contain as many as 8 to 10 different plasmids. In these instances, 
each plasmid can carry out different functions and have its own unique 
copy number, and each belongs to a different incompatibility group. Some 
plasmids, because of the specificity of their origin of replication, can
replicate in only one species of host cell. Other plasmids have less specific
origins of replication and can replicate in a number of bacterial species. These
plasmids are called narrow- and broad-host-range plasmids, respectively. 

As autonomous, self-replicating genetic elements, plasmids have the 
basic attributes to make them potential vectors for carrying cloned DNA. 
However, naturally occurring (unmodified, or nonengineered) plasmids 
often lack several important features that are required for a high-quality 
cloning vector. The more important features are (1) a choice of unique 
(single) restriction endonuclease recognition sites into which the insert 
DNA can be cloned and (2) one or more selectable genetic markers for iden¬ 
tifying recipient cells that carry the cloning vector-insert DNA construct. In 
other words, plasmid cloning vectors have to be genetically engineered. 

Plasmid Cloning Vector pBR322 

In the 1980s, one of the best-studied and most often used "general-purpose"
plasmid cloning vectors was pBR322. In general, plasmid cloning
vectors are designated by a lowercase p, which stands for plasmid, and
some abbreviation that may be descriptive or, as is the case with pBR322,
anecdotal. The "BR" of pBR322 recognizes the work of the researchers F.
Bolivar and R. Rodriguez, who created the plasmid, and 322 is a numerical
designation that has relevance to these workers. Plasmid pBR322 contains
4,361 bp. As shown in Fig. 3.9, pBR322 carries two antibiotic resistance
genes. One confers resistance to ampicillin (Amp^r), and the other confers
resistance to tetracycline (Tet^r). This plasmid also has unique BamHI,
HindIII, and SalI recognition sites within the Tet^r gene; a unique PstI site in
the Amp^r gene; a unique EcoRI site that is not within any coding DNA; and

Cleavage of DNA by RI Restriction Endonuclease
Generates Cohesive Ends

Recombinant DNA technology
requires a vector to carry 
cloned DNA, the specific 
joining of vector and cloned (insert) 
DNA molecules to form a vector- 
insert DNA construct, the introduction 
of the vector-insert DNA construct 
into a host cell, and the identification 
of host cells that acquired the cloned 
DNA. Without type II restriction 
endonucleases, it would be impossible 
to do recombinant DNA technology 
routinely. These enzymes facilitate the 
development of vectors (see, e.g., 
Bolivar et al., Gene 2:95-113, 1977) and
are essential for cloning genes into 
vectors. In 1968, M. Meselson and R.
Yuan (Nature 217:1110-1114) showed
that the capability of a strain of E. coli 
to prevent (restrict) the development 
of a bacterial virus (bacteriophage) 
was due to a host cell enzyme that 
cleaved the DNA of the infecting bac¬ 
teriophage. The study done by Mertz 
and Davis established that the RI 
restriction endonuclease from E. coli, 
which is now called EcoRI, cut DNA 
at a specific site and produced com¬ 
plementary extensions. Briefly, they 
showed that after circular DNA was 
linearized by treatment with EcoRI, 
some of the molecules formed 
hydrogen-bonded circular DNA mole¬ 
cules, which were converted to cova¬ 


lently closed circular DNA molecules 
by treating the sample with a DNA 
ligase. The extensions of all of the cut 
DNA molecules were the same and 
were estimated to be 4 to 6 nucleotides 
long, with the recognition site being 
six nucleotide pairs. Mertz and Davis 
concluded that "any two DNA mole¬ 
cules with RI sites can be 'recombined' 
at their restriction sites by the sequen¬ 
tial action of RI endonuclease and 
DNA ligase to generate hybrid DNA 
molecules." The discovery that EcoRI 
created cohesive ends was one of the 
most important contributions to the 
development of recombinant DNA 
technology because it provided, 
according to Mertz and Davis, a 
"simple way...to generate specifically 
oriented recombinant DNA molecules 
in vitro." 
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FIGURE 3.9 Plasmid pBR322. (A) Genetic 
map of the plasmid cloning vector 
pBR322. Unique Hindlll, Sail, BamHI, 
and PstI recognition sites are present in 
the Amp 1 and Tet r genes. The unique 
EcoRI site is just outside the Tet r gene. 
The origin of replication functions in 
the bacterium E. coli. The complete 
DNA sequence of pBR322 consists of 
4,361 bp. (B) Electron micrograph of 
plasmid pBR322. Magnification, 
xl00,000. Source: K. G. Murti. © Visuals 
Unlimited. 


an origin of DNA replication that functions only in E. coli, is maintained at 
a high copy number in E. coli, and cannot be readily transferred to other 
bacteria. 

How does pBR322 work as a cloning vector? Purified, closed circular
pBR322 molecules are cut with a restriction enzyme whose recognition site
lies within either of the antibiotic resistance genes and that cleaves the
plasmid DNA only once to create single, linear, sticky-ended DNA molecules.
These linear molecules are combined with prepared target DNA from a source
organism. This DNA has been cut with the same restriction enzyme, which
generates the same sticky ends as those on the plasmid DNA. The DNA mixture
is then treated with T4 DNA ligase in the presence of ATP. Under these
conditions, a number of different ligated combinations are produced,
including the original closed circular plasmid DNA. To reduce the amount of
this particular unwanted ligation product, the cleaved plasmid DNA
preparation is treated with the enzyme alkaline phosphatase to remove the 5'
phosphate groups from the linearized plasmid DNA. As a consequence, T4
DNA ligase cannot join the ends of the dephosphorylated linear plasmid
DNA (Fig. 3.10). However, the two phosphodiester bonds that are formed
by T4 DNA ligase after the ligation and circularization of alkaline
phosphatase-treated plasmid DNA with restriction endonuclease-digested
source DNA, which provides the phosphate groups, are sufficient to hold
the two molecules together, despite the presence of two nicks (Fig. 3.10).
After transformation, these nicks are sealed by the host cell DNA ligase
system. Digested fragments from the source DNA are also joined to each
other by T4 DNA ligase. However, these unwanted ligation products do
not contain an origin of replication and therefore will not replicate
following introduction into a host cell.
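
The bookkeeping behind the alkaline phosphatase step can be sketched in a few lines of Python. This is only an illustrative model, not a simulation of the enzymology: the `Fragment` class and the `describe_circle` function are hypothetical names introduced here, and the model assumes that every junction between two ends has two strand breaks, each of which T4 DNA ligase can seal only if the 5' end at that break still carries a phosphate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A linear, sticky-ended DNA fragment (hypothetical model type)."""
    name: str
    five_prime_phosphates: bool  # False after alkaline phosphatase treatment
    has_origin: bool             # True only for the plasmid backbone


def describe_circle(fragments):
    """Describe the circle made by joining the fragments end to end.

    At each junction one strand break needs the 5' phosphate of the
    fragment on the left and the other needs the 5' phosphate of the
    fragment on the right.
    """
    n = len(fragments)
    bonds_per_junction = []
    for i in range(n):
        left, right = fragments[i], fragments[(i + 1) % n]
        bonds = int(left.five_prime_phosphates) + int(right.five_prime_phosphates)
        bonds_per_junction.append(bonds)
    return {
        # The circle holds only if every junction has at least one bond.
        "held_together": all(b >= 1 for b in bonds_per_junction),
        "nicks": sum(2 - b for b in bonds_per_junction),
        "replicates": any(f.has_origin for f in fragments),
    }


vector = Fragment("phosphatase-treated plasmid", five_prime_phosphates=False, has_origin=True)
insert = Fragment("digested source DNA", five_prime_phosphates=True, has_origin=False)

print(describe_circle([vector]))          # plasmid alone: cannot recircularize
print(describe_circle([vector, insert]))  # held together by two bonds, two nicks
print(describe_circle([insert, insert]))  # ligated, but no origin of replication
```

Under these assumptions the model reproduces the three outcomes described above: the dephosphorylated plasmid cannot close on itself, the plasmid-insert circle is held by two phosphodiester bonds with two nicks remaining, and insert-insert products are sealed but cannot replicate.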

Transformation and Selection 

The next step in a recombinant DNA experiment requires the uptake of the
cloned plasmid DNA by a bacterial cell, usually E. coli. The process of
introducing purified DNA into a bacterial cell is called transformation, and a cell
that is capable of taking up DNA is said to be competent. Competence
occurs naturally in many bacteria. In different bacterial species, usually
when cell density is high or starvation is impending, a set of proteins is
produced that facilitates the uptake of DNA molecules. This phenomenon
allows genes to be transferred between different bacteria. A natural
transformation process often entails (1) the binding of double-stranded DNA to
components of the cell wall; (2) entry of the DNA into an inner compartment
(periplasm), where it is protected from enzymes that degrade nucleic acids
(nucleases); (3) transmission of one strand into the cytoplasm while the
other one is degraded; and (4) if the DNA is a linear molecule, integration
into the host chromosome. If the introduced DNA is a plasmid, it is
maintained in the cytoplasm after the second strand is synthesized. Competence
and transformation are not intrinsic properties of E. coli. However,
competence can be induced in E. coli by various special treatments, such as cold
calcium chloride, which in turn enhance the acquisition of DNA by the cell.
A brief heat shock facilitates the uptake of exogenous DNA molecules.

[Figure 3.10 diagram. Labels: plasmid DNA; target DNA; restriction endonuclease cleavage; alkaline phosphatase treatment; phosphodiester bond.]

FIGURE 3.10 Cloning foreign DNA into a plasmid vector. After restriction
endonuclease cleavage and alkaline phosphatase treatment, the plasmid DNA is
ligated to the restriction endonuclease-digested target DNA, and two of the
four nicks are sealed. This molecular configuration is stable, and the two
DNA molecules are covalently joined. After introduction into a host cell,
ensuing replication cycles produce new complete circular DNA molecules with
no nicks.

Two parameters, transformation frequency and transformation efficiency,
are used to assess the success of DNA transformation. The transformation
frequency is the ratio of transformed cells to the total number of
treated cells. The transformation efficiency is the number of transformed
cells as a function of the amount of DNA that was originally added to the
cells (usually expressed per microgram of DNA). Generally, transformation is
an inefficient process, with typically no more than 1 cell in 1,000 being
transformed. After transformation, most of the cells have not acquired a new
plasmid. Furthermore, a few cells are transformed by recircularized plasmid
DNA that escaped dephosphorylation by alkaline phosphatase, others acquire
ligated and nonligated nonplasmid DNA, and a few are transformed by the
plasmid-insert DNA construct.
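
The two measures are simple ratios, and a short Python sketch may help keep them apart. The function names and the example numbers below are invented for illustration, and expressing efficiency per microgram of DNA is an assumption about units rather than something fixed by the definitions above.

```python
def transformation_frequency(transformed_cells, treated_cells):
    """Fraction of the treated cells that were transformed."""
    if treated_cells <= 0:
        raise ValueError("treated_cells must be positive")
    return transformed_cells / treated_cells


def transformation_efficiency(transformed_cells, dna_micrograms):
    """Transformed cells per microgram of DNA added (assumed unit)."""
    if dna_micrograms <= 0:
        raise ValueError("dna_micrograms must be positive")
    return transformed_cells / dna_micrograms


# Hypothetical experiment: 1e8 treated cells, 1e5 transformants, 0.01 ug of DNA.
frequency = transformation_frequency(100_000, 100_000_000)
efficiency = transformation_efficiency(100_000, 0.01)
print(f"frequency  = {frequency:.0e} (1 cell in {round(1 / frequency):,})")
print(f"efficiency = {efficiency:.1e} transformants per microgram")
```

In this made-up experiment the frequency is 1 cell in 1,000, the upper bound quoted above, while the efficiency depends on how much DNA was added and can therefore look large even when the frequency is low.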

As noted earlier, extrachromosomal DNA that lacks an origin of
replication cannot be maintained within a bacterial cell. Thus, the uptake of
nonplasmid DNA is usually of no consequence in a recombinant DNA
experiment. To ensure that a plasmid-cloned DNA construct is perpetuated
in its original form, the E. coli host cells should have certain features.
For example, the absence of restriction endonucleases ensures that DNA
constructs will not be degraded after transformation. In addition, the
integrity of DNA constructs is more likely to be maintained in host cells that are
unable to carry out exchanges between DNA molecules because the host
cells are recombination negative (RecA^-). Also, cells that do not produce
the endonuclease encoded by the endA1 gene have increased transformation
frequencies.

After the transformation step, it is necessary to identify, as easily as 
possible, those cells that contain plasmids with cloned DNA. In a pBR322 
system in which the target DNA was inserted into the BamHI site, this 
specific identification is accomplished using the two antibiotic resistance 
markers that are carried on the plasmid. Following transformation, the 
cells are incubated in medium without antibiotics to allow the antibiotic 
resistance genes to be expressed, and then the transformation mixture is 
plated onto medium that contains the antibiotic ampicillin. Cells that carry 
pBR322 with or without insert DNA can grow under these conditions 
because the Amp^r gene on pBR322 is intact. The nontransformed cells are
sensitive to ampicillin. 

The BamHI site of pBR322 is within the Tet^r gene (Fig. 3.9), so the
insertion of DNA into this gene disrupts the coding sequence and tetracycline
resistance is lost. Therefore, cells with these plasmid-cloned DNA
constructs are resistant to ampicillin and sensitive to tetracycline. Cells with
recircularized pBR322 DNA, however, have an intact Tet^r gene and are
resistant to both ampicillin and tetracycline. The second step in the
selection scheme distinguishes between these two possibilities. Cells that grow
on the ampicillin-containing medium are transferred to a
tetracycline-containing medium. The relative positions of the cells transferred to the
tetracycline-agar plate are the same as those of the colonies from which they
were transferred on the original ampicillin-agar plate. Cells that form
colonies on the tetracycline-agar plates carry recircularized pBR322 without
insert DNA, because as noted above, these cells are resistant to both
ampicillin and tetracycline. Those cells that do not grow on the tetracycline-agar
plates, however, are sensitive to tetracycline and carry pBR322-cloned
DNA constructs (Fig. 3.11).
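
The two-step selection amounts to a comparison between two plates, which can be sketched in Python as follows. The function names, the plate positions, and the cell assignments are all hypothetical; the model simply assumes that any pBR322 plasmid confers ampicillin resistance and that an insert in the BamHI site always abolishes tetracycline resistance.

```python
def expected_growth(has_plasmid, insert_in_tet_gene):
    """Return (grows_on_ampicillin, grows_on_tetracycline) for one cell type."""
    amp_resistant = has_plasmid                             # Amp^r gene is intact
    tet_resistant = has_plasmid and not insert_in_tet_gene  # Tet^r gene may be disrupted
    return amp_resistant, tet_resistant


def pick_insert_colonies(amp_plate, tet_plate):
    """Positions that grew on the ampicillin plate but not on the tetracycline replica."""
    return sorted(set(amp_plate) - set(tet_plate))


# Hypothetical transformation mixture: position -> (has_plasmid, insert_in_tet_gene)
cells = {
    "a1": (False, False),  # nontransformed cell
    "a2": (True, False),   # recircularized pBR322
    "a3": (True, True),    # pBR322 with DNA cloned into the BamHI site
    "b1": (True, True),
}

amp_plate = [pos for pos, cell in cells.items() if expected_growth(*cell)[0]]
tet_plate = [pos for pos in amp_plate if expected_growth(*cells[pos])[1]]
print(pick_insert_colonies(amp_plate, tet_plate))
```

Keeping the same position labels on both plates plays the role of the fixed colony positions on the master and replica plates; without that correspondence the set difference would be meaningless.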

Individual cultures are established from each of the colonies on the
ampicillin-agar plates that proved to be sensitive to tetracycline. Later,
additional screening procedures can be conducted to verify that these cells,
called transformants, carry the desired pBR322-cloned DNA construct. The
HindIII and SalI sites in the tetracycline resistance gene and the PstI site in
the ampicillin resistance gene of pBR322 provide alternative potential
cloning locations. When the PstI recognition site is used for cloning, the
principle of the selection scheme is the same but the antibiotic sensitivities
are reversed; thus, the first set of plates contains tetracycline and the second
set contains ampicillin.

The pBR322 selection scheme for identifying transformed cells with
insert DNA-vector constructs relies on replica plating. This technique,
which can be used in many different ways for various purposes, was
originally devised to isolate mutant bacterial colonies that require a
supplement for growth, i.e., auxotrophic mutants (Fig. 3.12).

Other Plasmid Cloning Vectors 

The plasmid pBR322 was a well-conceived cloning vector. However, it has
only a few unique cloning sites, and the selection procedure is
time-consuming. Thus, inevitably, other systems were developed. For example, the
plasmid pUC19 is 2,686 bp long and contains an Amp^r gene; a segment of
the β-galactosidase gene (lacZ') of the lactose operon of E. coli under the
control of the regulatable lac promoter; a lacI gene that produces a repressor
protein that regulates the expression of the lacZ' gene from the lac
promoter; a short DNA sequence with many unique cloning sites (e.g., EcoRI,
SacI, KpnI, XmaI, SmaI, BamHI, XbaI, SalI, HincII, AccI, BspMI, PstI, SphI,
and HindIII), which is called a multiple cloning site (multiple cloning
sequence, multicloning site, or polylinker); and the origin of DNA
replication from pBR322 (Fig. 3.13).

FIGURE 3.11 Strategy for selecting host cells that have been transformed with pBR322.
(1) The transformation mixture, which contains three cell types, viz.,
nontransformed cells, cells with the intact original plasmid, and cells with DNA cloned into
the BamHI site of pBR322, is plated on complete medium with ampicillin. (2) The
mixture is diluted beforehand to ensure separate colonies are formed on the agar.
The nontransformed cells (Amp^s) are killed. The cells with the intact plasmid and
cloned DNA-plasmid constructs are Amp^r and therefore form colonies. Samples of
the surviving colonies on the ampicillin plate are transferred to a plate with
complete medium and tetracycline, keeping the same position of each colony on the
second plate, i.e., replica plating. Only cells with intact plasmids (Tet^r) will form
colonies in the presence of tetracycline. (3) The colonies that did not grow on the
tetracycline plate (dashed circles) but grew on the ampicillin plate carry pBR322
with DNA that was cloned into the BamHI site. The colonies with cloned DNA
inserts are picked from the original plate, pooled, and grown. The red square
represents an orientation marker that keeps the master and replica plates aligned.

FIGURE 3.12 Screening bacterial colonies for mutant strains by replica plating. (A)
Replica-plating (colony transfer) device; (B) replica-plating technique. Cells from
each separated colony on a master plate (1) adhere to the velveteen of the
replica-plating device after it is gently pressed against the agar surface (2). The adhering
cells are transferred (3), in succession, to a petri plate with complete medium (4)
and to one with selective medium (5). The pattern of the colonies is consistent
among the replicated plates because the orientation markers (red squares) are
aligned for each transfer. In this example, minimal medium is the selective medium
used to identify colonies that require a nutritional supplement for growth, i.e.,
auxotrophic mutants. The missing colony (dashed circle) on the minimal medium (5)
denotes an auxotrophic mutation. The equivalent location on the plate with
complete medium (4) has the colony with the auxotrophic mutation that can be picked
and grown (6). Further analysis of the isolated strain is necessary to determine the
nature of the auxotrophic mutation.

The pUC19 selection procedure has the following rationale. When cells
carrying unmodified pUC19 are grown in the presence of
isopropyl-β-D-thiogalactopyranoside (IPTG), which is an inducer of the lac operon, the
protein product of the lacI gene can no longer bind to the promoter-operator
region of the lacZ' gene, so the lacZ' gene in the plasmid is transcribed
and translated. The LacZ' protein combines with a protein (LacZα) that is
encoded by chromosomal DNA to form an active hybrid β-galactosidase.
In pUC19, the multiple cloning site is incorporated into the lacZ' gene in the
plasmid without interfering with the production of the functional hybrid
β-galactosidase. Finally, if the substrate
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal) is present in the medium, it is hydrolyzed by
this hybrid β-galactosidase to a blue product. Under these conditions,
colonies containing unmodified pUC19 appear blue.

For a pUC19 cloning experiment, DNA from a source organism is cut
with one of the restriction endonucleases for which there is a recognition
site in the multiple cloning site (Fig. 3.14). This source DNA is mixed with
pUC19 plasmid DNA that has been treated with the same restriction
endonuclease and then with alkaline phosphatase. After ligation with T4 DNA
ligase, the reaction mixture is introduced by transformation into a host cell
which can synthesize that part of β-galactosidase (LacZα) that combines
with the product of the lacZ' gene to form a functional enzyme. The treated
host cells are plated onto medium that contains ampicillin, IPTG, and
X-Gal.

FIGURE 3.13 Genetic map of the plasmid cloning vector pUC19. The multiple cloning
site contains unique sites for the restriction endonucleases that are used for the
insertion of cloned DNA. The plasmid contains an Amp^r gene, an origin of
replication that functions in E. coli, and the lacI gene, which produces a repressor that
blocks the transcription of the lacZ' gene in the absence of the inducer IPTG. The
complete DNA sequence of pUC19 is 2,686 bp long.

FIGURE 3.14 Plasmid pUC19 multiple cloning site. The multiple cloning site
(uppercase nucleotides) is inserted into the lacZ' gene (lowercase nucleotides). Some of the
unique restriction endonuclease sites of the multiple cloning site are named and
demarcated by horizontal lines. The double arrows mark the lacZ'-multiple cloning
site DNA sequence that encodes the first 26 amino acids of the hybrid LacZ' protein.
Insertion of DNA into any of the unique restriction endonuclease sites of the
multiple cloning site changes the reading frame of the lacZ' gene and prevents the
correct translation of the LacZ' protein.

Nontransformed cells cannot grow in the presence of ampicillin. Cells
with recircularized plasmids can grow with ampicillin in the medium, and
because they can form functional β-galactosidase, they produce blue
colonies. In contrast, host cells that carry a plasmid-cloned DNA construct
produce white colonies on the same medium. The reason for this is that,
usually, DNA inserted into a restriction endonuclease site within the
multiple cloning site disrupts the correct sequence of DNA codons (reading
frame) of the lacZ' gene and prevents the production of a functional LacZ'
protein, so no active hybrid β-galactosidase is produced (Fig. 3.14). In the
absence of β-galactosidase activity, the X-Gal in the medium is not
converted to the blue compound, so these colonies remain white. The white
(positive) colonies subsequently must be screened to identify those that
carry a specific target DNA sequence.
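
The blue-white logic can be written out as a small decision function. The sketch below is a deliberate simplification with a hypothetical function name: it assumes that an insert always disrupts the lacZ' reading frame, whereas the text says only that this is usually the case, and it treats IPTG, X-Gal, and ampicillin as simply present or absent.

```python
def colony_phenotype(has_plasmid, has_insert, ampicillin=True, iptg=True, xgal=True):
    """Predict what a plated cell gives rise to: 'no growth', 'blue', or 'white'."""
    if ampicillin and not has_plasmid:
        return "no growth"  # no Amp^r gene, so the cell is killed
    # Assumption: any insert in the multiple cloning site inactivates lacZ'.
    lacz_fragment_made = has_plasmid and not has_insert and iptg
    if lacz_fragment_made and xgal:
        return "blue"       # active hybrid beta-galactosidase hydrolyzes X-Gal
    return "white"


for label, plasmid, insert in [
    ("nontransformed", False, False),
    ("recircularized pUC19", True, False),
    ("pUC19 with insert", True, True),
]:
    print(f"{label:22s} -> {colony_phenotype(plasmid, insert)}")
```

The default arguments correspond to the plating medium described above (ampicillin, IPTG, and X-Gal); leaving out IPTG in this model turns the recircularized-plasmid colonies white as well, which is why the inducer is part of the medium.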

In addition to ampicillin and tetracycline, other antibiotics are used as 
selective agents in various cloning vectors (Table 3.5). Moreover, a number 
of inventive selection systems have been devised to identify cells with 
insert-vector constructs. For example, a vector that is derived from the 
pUC series carries a gene that, when expressed, encodes a protein that kills 
the cell (suicide protein). This cell-killing gene is fused in the correct 
reading frame to the lacZ' gene so that it is transcribed from the regulatable 
lacZ' gene promoter. A cell with an intact plasmid and no IPTG in the 
medium does not synthesize the suicide protein. Cells with a plasmid and 
no insert, in the presence of IPTG, synthesize the suicide protein and are 
killed. With an insert and IPTG, a nonfunctional suicide protein is produced
because the insert, in all likelihood, disrupts the reading frame of the
suicide gene. Nontransformed cells are sensitive to an antibiotic, whereas 
transformed cells have as part of the vector a gene that confers resistance 
to the antibiotic. In other words, in this case, the only surviving cells in the 
presence of IPTG and antibiotic are those that carry a plasmid with a DNA 
insert. 
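
The outcome of this positive-selection scheme can be tabulated with a few lines of Python. The `survives` function is a hypothetical helper written for this illustration, and it assumes the idealized case in which an insert always inactivates the suicide gene.

```python
def survives(has_plasmid, has_insert, iptg, antibiotic):
    """Predict whether a cell survives under the suicide-gene selection."""
    if has_insert and not has_plasmid:
        raise ValueError("an insert can only be carried on a plasmid")
    if antibiotic and not has_plasmid:
        return False  # no resistance gene, so the antibiotic kills the cell
    # The suicide protein is made only from an intact, induced gene.
    suicide_protein_active = has_plasmid and not has_insert and iptg
    return not suicide_protein_active


cell_types = {
    "nontransformed": (False, False),
    "plasmid, no insert": (True, False),
    "plasmid with insert": (True, True),
}
survivors = [
    name
    for name, (plasmid, insert) in cell_types.items()
    if survives(plasmid, insert, iptg=True, antibiotic=True)
]
print(survivors)
```

With both IPTG and the antibiotic in the medium, only the cells carrying a plasmid with an insert remain, so no second screening plate is needed to separate them from cells with recircularized vector.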

Although a number of vectors have ingenious designs, in principle they 
all retain the two basic requirements of recombinant DNA technology. There 
is both a choice of cloning sites and an easy way of identifying cells with 
plasmid-cloned DNA constructs. It should be noted that unique restriction 
endonuclease sites have a dual function in recombinant DNA research. They 
are required for inserting DNA into a cloning vector, and they allow an 
inserted DNA sequence to be recovered from the vector. In other words, 
after a piece of DNA has been cloned into a plasmid that was cut with the 
same restriction endonuclease, it can be retrieved by cutting the purified 
plasmid-cloned DNA construct with that restriction endonuclease because 
the insertion event recreates the recognition site at each end of the cloned 
DNA sequence. A recovered DNA fragment can be cloned into specialized 
cloning vectors for DNA sequencing or vectors that have been specifically
designed to achieve high levels of expression (transcription and translation)
of the cloned gene. In fact, thousands of vectors have been developed for a 
variety of purposes and for many different organisms. 
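
The statement that insertion recreates the recognition site at each end of the cloned DNA can be checked with a toy example. The sketch below uses EcoRI (recognition sequence GAATTC, cut after the G) and tracks only the top strand; the sequences are invented, the function name is hypothetical, and the vector is written as a linear string for simplicity even though a real plasmid is circular.

```python
SITE = "GAATTC"  # EcoRI recognition sequence
CUT_OFFSET = 1   # EcoRI cuts the top strand after the G: G^AATTC


def digest_linear(seq):
    """Cut a linear top-strand sequence at every EcoRI site and return the fragments."""
    fragments, start = [], 0
    pos = seq.find(SITE)
    while pos != -1:
        fragments.append(seq[start:pos + CUT_OFFSET])
        start = pos + CUT_OFFSET
        pos = seq.find(SITE, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Invented sequences: the source DNA has two EcoRI sites, the vector has one.
source = "CCATGAATTCATGAAACGTGAATTCTTAG"
vector = "TTGACAGAATTCGGCTAA"

_, insert, _ = digest_linear(source)        # internal EcoRI fragment of the source
left_arm, right_arm = digest_linear(vector)

construct = left_arm + insert + right_arm   # ligation of compatible sticky ends
print(construct.count(SITE))                # a site is recreated at each junction
print(digest_linear(construct)[1] == insert)
```

Because the left arm ends in G and the insert begins with AATTC, and the insert in turn ends in G next to the AATTC of the right arm, both junctions read GAATTC again, and a second digestion releases the same fragment that went in.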

Even though E. coli, which is well-known as a laboratory organism, is
used for all routine molecular-cloning procedures, other bacteria, such as
Bacillus subtilis and Agrobacterium tumefaciens, often act as the final host
cells. For many applications, cloning vectors that function in E. coli may be
provided with a second origin of replication that enables the plasmid to
replicate in the alternative host cell. With these shuttle cloning vectors, the
initial cloning steps are conducted with E. coli before the fully developed
construct is introduced into a different host cell. In addition, a number of
plasmid vectors have been constructed with a single broad-host-range
origin of DNA replication instead of a narrow-host-range origin of
replication. These vectors can be used with a variety of microorganisms.

TABLE 3.5 Some antibiotics commonly used as selective agents

Antibiotic (abbreviations)   Description
Ampicillin (Ap, Amp)         Inhibits cell wall formation; inactivated by β-lactamase
Hygromycin B (HygB)          Blocks translocation from aminoacyl site to peptidyl site; inactivated by a phosphotransferase
Kanamycin (Km, Kan)          Binds to 30S subunit and prevents translocation from aminoacyl-tRNA site to peptidyl site; inactivated by a phosphotransferase
Neomycin (Nm, Neo)           Binds to 30S subunit and inhibits protein synthesis; inactivated by a phosphotransferase
Streptomycin (Sm, Str)       Blocks protein initiation complex formation and causes misreading during translation; inactivated by a phosphotransferase
Tetracycline (Tc, Tet)       Prevents binding of aminoacyl-tRNA to 30S ribosomal subunit; resistance gene encodes an inner cell membrane protein that passes the antibiotic out of the cell and blocks the passage of the antibiotic through the cell wall

Shuttle vectors have some drawbacks. The addition of a segment of
DNA containing the second origin of replication increases the size of the
vector and reduces the amount of DNA that can be inserted, and in some
instances, shuttle vectors are not efficiently propagated in the host cell.
Also, broad-host-range cloning vectors can be unstable and can be lost
from a preferred host cell. To overcome these limitations, a system was
devised for cloning a DNA insert into an E. coli-based plasmid and then
combining the part of the plasmid that carries both the cloned DNA and an
antibiotic resistance gene with a part of a host cell-specific plasmid carrying
its own origin of replication. The first step in the creation of this shuttle
vector requires engineering two different recognition sites for the
restriction endonuclease SfiI. The sequence of this site is 5'-GGCCNNNNNGGCC-3',
where N represents any base pair (Fig. 3.15A). Two different SfiI sites
(SfiI_x and SfiI_y) are designed to have different variable sequences so that
after digestion with SfiI, the extensions of the SfiI_x site are not
complementary to those of the SfiI_y site (Fig. 3.15B). Next, the two SfiI
sequences are inserted into both the E. coli-based and host cell-specific
plasmids so that they flank the region containing the antibiotic resistance
gene and the cloning site on the E. coli-based plasmid (Fig. 3.16A) and the
origin of replication on the host cell-specific plasmid (Fig. 3.16B). After a
DNA sequence is cloned into the E. coli-based plasmid and the construct is
grown in E. coli, the E. coli-based and host cell-specific plasmids are
purified separately, mixed, and digested with SfiI. The SfiI_x-SfiI_x and
SfiI_y-SfiI_y extensions base pair, and after ligation, several different
"chimeric" circular DNA molecules are formed. The mixture is transformed into
the host cell and selected on medium containing an antibiotic, i.e.,
chloramphenicol. Cells without any plasmid and those with plasmids without
the chloramphenicol resistance gene cannot grow in the presence of
chloramphenicol. Also, plasmids that do not carry an origin of replication or
that carry the origin of replication from the E. coli-based plasmid are not
replicated in the host cell and therefore are not maintained. Only cells
transformed with a chimeric plasmid that carries the part of the E. coli-based
plasmid with the chloramphenicol resistance gene and the cloned gene joined
to the fragment of the host cell-specific plasmid that contains the origin of
replication that functions in the host cell are resistant to chloramphenicol
and, consequently, can grow (Fig. 3.16C). The "SfiI_x-SfiI_y" procedure,
which can be adapted for any host cell that has a plasmid, is called vector
backbone exchange.

FIGURE 3.15 (A) SfiI recognition site. The arrows mark the cleavage sites, and N
represents any base pair. (B) The SfiI_x and SfiI_y recognition sites are designed
so that after digestion with the restriction endonuclease SfiI, the SfiI_x
extensions will not base pair with the SfiI_y extensions. The nucleotide differences
between the SfiI_x (red) and SfiI_y (orange) recognition sites are noted.
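
Why two SfiI sites with different variable sequences give incompatible ends can be illustrated with a short Python sketch. SfiI cleaves within the variable region (GGCCNNNN^NGGCC on the top strand) and leaves a 3-nucleotide 3' extension, so the extension sequence is set by the designer rather than by the enzyme. The two site sequences below are invented for illustration and are not the ones shown in Fig. 3.15; the function names are likewise hypothetical.

```python
import re

SFII_PATTERN = re.compile(r"GGCC[ACGT]{5}GGCC")


def sfii_overhang(site):
    """Return the 3-nt 3' extension left on the top strand after SfiI cleavage.

    The top strand is cut after the 8th base and the bottom strand opposite
    the 5th, so bases 6 to 8 of the 13-bp site form the single-stranded end.
    """
    if not SFII_PATTERN.fullmatch(site):
        raise ValueError(f"not an SfiI recognition site: {site!r}")
    return site[5:8]


def ends_can_pair(site_a, site_b):
    """True if the left end cut from site_a can anneal to the right end cut from site_b."""
    # The right-hand end carries the complement of bases 6 to 8 of its own
    # site, so the two ends anneal only when those three bases are identical.
    return sfii_overhang(site_a) == sfii_overhang(site_b)


SFII_X = "GGCCAACGTGGCC"  # hypothetical "x" site
SFII_Y = "GGCCATGCAGGCC"  # hypothetical "y" site

print(sfii_overhang(SFII_X), sfii_overhang(SFII_Y))
print(ends_can_pair(SFII_X, SFII_X), ends_can_pair(SFII_X, SFII_Y))
```

With these example sites an x end anneals only to another x end and a y end only to another y end, which is the property that forces the fragments from the two plasmids to assemble in a defined way during vector backbone exchange.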


Creating and Screening a Library 

Making a Genomic Library 

One of the fundamental objectives of molecular biotechnology is the isolation
of genes that encode proteins for industrial, agricultural, and medical
applications. In prokaryotic organisms, structural genes form a continuous 
coding domain in the genomic DNA, whereas in eukaryotes, the coding 
regions (exons) of structural genes are separated by noncoding regions 
(introns). Consequently, different cloning strategies have to be used for 
cloning prokaryotic and eukaryotic genes. 

In a prokaryote, the desired sequence (target DNA, or gene of interest) 
is typically a minuscule portion (about 0.02%) of the total chromosomal 
DNA. The problem, then, is how to clone and select the targeted DNA 
sequence. To do this, the complete DNA of an organism, i.e., the genome, 
is cut with a restriction endonuclease, and each fragment is inserted into a 
vector. Then, the specific clone that carries the target DNA sequence must 
be identified, isolated, and characterized. The process of subdividing 
genomic DNA into clonable elements and inserting them into host cells is 
called creating a library (clone bank, gene bank, or genomic library). A 
complete library, by definition, contains all of the genomic DNA of the 
source organism. 

[Figure 3.16 diagram. Labels: (A) cloned gene; (B); mix and digest with SfiI; ligate and transform host cells; (C) cloned gene.]

FIGURE 3.16 Vector backbone exchange. Shown are an E. coli-based plasmid with a
cloned DNA sequence, a chloramphenicol resistance gene (Cm^r), and an E.
coli-specific origin of replication (ori_E) (A) and a host cell-specific plasmid with a
host-specific origin of replication (ori_H) and an erythromycin resistance gene (Ery^r) (B) that
have each been engineered with SfiI_x and SfiI_y recognition sites. Treatment of the
plasmids shown in panels A and B with SfiI generates two fragments from each
plasmid with SfiI_x and SfiI_y extensions. Several different circular chimeric DNA
molecules are formed after base pairing between complementary extensions and
ligation (not shown). (C) After transformation of the mixture of chimeric DNA molecules
into the host cell, only cells carrying a chimeric plasmid that has an origin of
replication that functions in the host cell, as well as the cloned DNA sequence and the
chloramphenicol resistance gene from the E. coli-based plasmid, will be selected on
medium containing chloramphenicol.

FIGURE 3.17 Partial digestion of a fragment of DNA with a type II restriction
endonuclease. Partial digestions are usually performed by varying either the length of
time or the amount of enzyme used for the digestion. In some of the DNA
molecules, the restriction endonuclease has cut at all sites (each labeled RE1). In other
molecules, fewer cleavages have occurred. The desired outcome is a sample with
DNA molecules of all possible lengths.

One way to create a genomic library is to first treat the DNA from a
source organism with a four-cutter restriction endonuclease, e.g., Sau3AI,
which theoretically cleaves the DNA approximately once in every 256 bp.
The conditions of the digestion reaction are set to give a partial, not a
complete, digestion. In this way, all possible fragment sizes are generated (Fig.
3.17). Partial digestions are carried out with a low concentration of
restriction endonuclease or shortened incubation times. The fragments become
smaller as the reaction period is extended (Fig. 3.18). To ensure that the
entire genome, or most of it, is contained within the clones of a library, the
sum of the inserted DNA in the clones of the library should be three or more
times the amount of DNA in the genome. For example, if a genome has 4 ×
10^6 bp and the average size of an insert is 1,000 bp, then 12,000 clones are
required for threefold coverage, i.e., 3[(4 × 10^6)/10^3]. For the human genome
(3.3 × 10^9 bp), about 88,000 bacterial artificial chromosome (BAC) clones that
have an average insert size of 150,000 bp compose a library with fourfold
coverage, i.e., 4[(3.3 × 10^9)/(15 × 10^4)]. From a statistical perspective, the
relationship N = ln(1 - P)/ln(1 - f) (where N is the number of clones, P is the
probability of finding a specific gene, and f is the ratio of the length of the
average insert to the size of the entire genome) provides an estimate of the
number of clones that is necessary for a comprehensive genomic library. On
this basis, about 760,000 clones are required for a 99% chance of discovering
a particular sequence in a human genomic library with an average insert
size of 20 kb. Finally, because restriction endonuclease sites are not
randomly located, some fragments may be too large to be cloned. When this
occurs, it may be difficult or even impossible to find a specific target DNA
sequence because the library is incomplete. This problem can be overcome
by forming libraries with different restriction endonucleases. Clearly, the
number of clones in a genomic library depends on the extent of the
coverage, the size of the genome of the organism (Table 3.6), and the average
size of the insert in the vector.
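
The library-size estimates in this paragraph are easy to reproduce. The following Python sketch implements the coverage calculation and the relationship N = ln(1 - P)/ln(1 - f); the function names are hypothetical, results are rounded up to whole clones, and the cutting frequency assumes equal base frequencies and randomly distributed sites.

```python
import math


def expected_site_spacing(site_length):
    """Average spacing (bp) between sites of the given length in random DNA."""
    return 4 ** site_length


def clones_for_coverage(genome_bp, insert_bp, coverage):
    """Clones needed so that total insert DNA is `coverage` times the genome."""
    return math.ceil(coverage * genome_bp / insert_bp)


def clones_for_probability(genome_bp, insert_bp, probability):
    """N = ln(1 - P) / ln(1 - f), with f = insert size / genome size."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log1p(-f))


print(expected_site_spacing(4))                                # four-cutter such as Sau3AI
print(clones_for_coverage(4_000_000, 1_000, 3))                # bacterial genome, 1-kb inserts
print(clones_for_coverage(3_300_000_000, 150_000, 4))          # human genome, BAC inserts
print(clones_for_probability(3_300_000_000, 20_000, 0.99))     # human genome, 20-kb inserts
```

With a genome size of 3.3 × 10^9 bp the probability formula gives roughly 7.6 × 10^5 clones for 20-kb inserts. Note that the formula demands far more clones than simple coverage would suggest, because randomly picked clones overlap and some regions are sampled repeatedly before others are sampled at all.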

After a library is created, the clone(s) with the target sequence must be
identified. Four popular methods of identification are used: DNA
hybridization with a labeled DNA probe followed by radiographic screening for
the probe label, immunological screening for the protein product, assaying
for protein activity, and functional (genetic) complementation.

Screening by DNA Hybridization 

The presence of a target nucleotide sequence in a DNA sample can be 
determined with a DNA probe. This procedure, called DNA hybridization, 
depends on the formation of stable base pairs between the probe and the 
target sequence. DNA hybridization is feasible because naturally occurring 
double-stranded DNA can be converted into single-stranded DNA by heat 
or alkali treatment. Heating DNA breaks the hydrogen bonds that hold the

FIGURE 3.18 Effect of increasing the time of restriction endonuclease
digestion of a DNA sample. (A) The restriction endonuclease sites (arrows)
of a DNA molecule are shown. (B) As the duration of restriction endonuclease
treatment is extended, cleavage occurs at an increased number of sites
(lanes 1 to 5). Lane 1 represents the size of the DNA molecule at the time of
addition of restriction endonuclease. Lanes 2 to 5 depict the extents of DNA
cleavage after increasing exposures to restriction endonuclease.


TABLE 3.6 DNA contents of various organisms

Organism                     Genome size (millions of base pairs)
Mycoplasma genitalium        0.58
Methanococcus jannaschii     1.66
Haemophilus influenzae Rd    1.83
Neisseria meningitidis       2.27
Lactococcus lactis           2.36
Mycobacterium leprae         3.26
Vibrio cholerae              4.00
B. subtilis                  4.20
E. coli K-12                 4.60
E. coli O157:H7              5.50
Pseudomonas aeruginosa       6.30
Mesorhizobium loti           7.59
Saccharomyces cerevisiae     13
Caenorhabditis elegans       97
Arabidopsis thaliana         125
Drosophila melanogaster      165
Fugu rubripes                400
Solanum tuberosum            840
Zea mays                     2,700
Mus musculus                 3,000
Homo sapiens                 3,300
Hordeum vulgare              5,500
Triticum aestivum            17,300
































CHAPTER 3 


[Figure 3.19 diagram. Panel 1, Prepare target DNA: extract, denature,
immobilize. Panel 2, Prepare probe DNA: label, denature. Panel 3,
Hybridization: each labeled probe strand (asterisk) base pairs with its
complementary sequence in the immobilized target strands.]

FIGURE 3.19 DNA hybridization. (1) The DNA of samples containing the putative 
target DNA is denatured, and the single strands are kept apart, usually by binding 
them to a solid support, such as a nitrocellulose or nylon membrane. (2) The probe, 
which is often 100 to 1,000 bp in length, is labeled, denatured, and mixed with the 
denatured putative target DNA under hybridization conditions. (3) After the 
hybridization reaction, the membrane is washed to remove nonhybridized probe 
DNA and assayed for the presence of any hybridized labeled tag. If the probe does 
not hybridize, no label is detected. The asterisks denote the labeled tags (signal) of 
the probe DNA. 


bases together (denaturation) but does not affect the phosphodiester bonds
of the DNA backbone. If the heated solution is rapidly cooled, the strands
remain single stranded. However, if the temperature of a heated DNA solution
is lowered slowly, the double-stranded, helical conformation of DNA can be
reestablished because of the base pairing of complementary nucleotides
(renaturation). The process of heating and slowly cooling double-stranded DNA
is called annealing. When DNA fragments from different sources with some
shared (homologous) sequences are mixed, heated to 100°C, and slowly cooled,
there will be some hybrid DNA molecules among the annealed products, that is,
double-stranded DNA in which the strands come from the different sources.

In general, for a DNA hybridization assay, the target DNA is denatured
and the single strands are irreversibly bound to a matrix, e.g., nitrocellulose
or nylon. Then, the single strands of a DNA probe, which are labeled with
either a radioisotope or another tagging system, are incubated with the
bound DNA sample. If the sequence of nucleotides in the DNA probe is
complementary to a nucleotide sequence in the sample, then base pairing,
i.e., hybridization, occurs (Fig. 3.19). The hybridization can be detected by
autoradiography (Box 3.2) or other visualization procedures depending on
the nature of the probe label. If the nucleotide sequence of the probe does
not base pair with a DNA sequence in the sample, then no hybridization
occurs and the assay gives a negative result. Generally, probes range in
length from 100 to more than 1,000 bp, although both larger and smaller
probes can be used. Depending on the conditions of the hybridization
reaction, stable base pairing requires a match of >80% within a segment of 50
bases.
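
The idea of a match threshold can be sketched in a few lines. The example
below is a deliberate simplification: it counts identical positions in
ungapped windows and ignores temperature, salt concentration, and every other
factor that governs real hybridization. The sequences are arbitrary short
strings, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def hybridization_sites(target, probe, min_identity=0.8):
    """Positions on a single-stranded target (5' to 3') where the probe
    could base pair with at least `min_identity` of its bases matched."""
    site = reverse_complement(probe)   # what the probe pairs with on the target
    n = len(site)
    hits = []
    for i in range(len(target) - n + 1):
        window = target[i:i + n]
        identity = sum(a == b for a, b in zip(window, site)) / n
        if identity >= min_identity:
            hits.append((i, identity))
    return hits

target = "ATCGTAGTCGTAGGTCGGTTAGCTTGAACC"   # illustrative target strand
hits = hybridization_sites(target, "CCGACCTA")
no_hits = hybridization_sites(target, "AAAAAAAA")
```

Lowering `min_identity` mimics less stringent conditions, under which a probe
with more mismatches is still scored as a hit.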

DNA probes can be labeled in various ways. One strategy, which is
called the random-primer method, utilizes a mixture of synthetic random
oligonucleotides (oligomers) containing all possible combinations of
sequences of 6 nucleotides (hexamers) that act as primers for DNA synthesis.
On the basis of the chance occurrence of complementary sequences,
some of the oligomers in the sample will hybridize to complementary
sequences on the unlabeled probe DNA template (Fig. 3.20). After the oligomer
sample is mixed with the denatured probe template DNA, the four
deoxyribonucleotides (deoxyribonucleoside triphosphates [dNTPs]) and a
portion of E. coli DNA polymerase I called the Klenow fragment are added.
The dNTPs are deoxyadenosine triphosphate (dATP), deoxyribosylthymine
triphosphate (dTTP), deoxyguanosine triphosphate (dGTP), and


BOX 3.2

Autoradiography

Autoradiography is used to detect the location of a radiolabeled entity in a
cell or sample of fractionated macromolecules. In principle, autoradiography
consists of placing a radioactive source next to a radiosensitive photographic
film that contains silver bromide. The energy from the decay of the
radioisotope hits the photographic emulsion and produces electrons that are
trapped by specks of silver bromide crystals in the emulsion. The negatively
charged specks attract silver ions, and metallic silver is formed. The grains
of metallic silver are visualized by developing the photographic film. Thus,
an exposed dark region on a developed film indicates that the underlying
material was radiolabeled. Parenthetically, fluorography is the term used for
the exposure of light-sensitive photographic film to molecules that directly
or indirectly generate light as the source of energy that reduces silver in
the photographic emulsion.

Proteins and nucleic acids that are radiolabeled and separated by gel
electrophoresis can be visualized by placing an X-ray film on a dried gel
and developing the film after a suitable exposure time. All autoradiographic
steps are carried out in the dark to avoid inadvertent exposure of the X-ray
film to light. A number of autoradiographic techniques have been devised for
the quantitative and qualitative analysis of proteins and nucleic acids.

One of the major applications of autoradiography is the detection of the
hybridization of a radiolabeled DNA probe to a DNA molecule that has been
electrophoretically fractionated. However, DNA molecules in a gel are not
accessible to hybridization with a DNA probe. Consequently, the DNA molecules
in the gel are transferred by blotting or electrotransfer to a nitrocellulose
or nylon membrane. The transfer process retains the same positions on the
membrane as the DNA molecules had in the gel. The DNA molecules that are
transferred to a membrane are denatured, bound to the membrane, and hybridized
with a radiolabeled DNA probe. Autoradiography of the membrane reveals whether
the probe hybridized to a particular DNA band(s).

The transfer of DNA from a gel to a membrane is called Southern blotting
(Southern DNA blotting) after Edwin Southern, who devised the original DNA
blotting strategy. Northern blotting and Western blotting are methods for the
transfer of RNA and protein, respectively, from a gel to a membrane. The terms
"Northern" and "Western" have nothing to do with direction and were coined by
molecular biologists both to give Edwin Southern further credit for developing
the notion of blotting macromolecules from a gel to a membrane and to
distinguish the macromolecules that are transferred. The designations
"Northern" and "Western" are also examples of molecular biology humor.
FIGURE 3.20 Production of labeled probe DNA by the random-primer method. The
duplex DNA containing the sequence that is to act as the probe is denatured, and
an oligonucleotide sample containing all possible sequences of 6 nucleotides is
added. It is a statistical certainty that some of the molecules of the oligonucleotide
mixture will hybridize to the unlabeled, denatured probe DNA. In the presence of
Klenow fragment and the four dNTPs, one of which is labeled with a tag (*), the
base-paired oligonucleotides act as primers for DNA synthesis. The synthesized
DNA is labeled and used as a probe to detect the presence of a DNA sequence in a
DNA sample. In this case, the labeled probe consists of a number of separate DNA
molecules that together constitute almost the entire sequence of the original
unlabeled template DNA.
deoxycytidine triphosphate (dCTP). The Klenow fragment retains both
DNA polymerase and 3' exonuclease activities but lacks the 5' exonuclease
activity that is normally associated with E. coli DNA polymerase I (Fig.
3.21). The 3' exonuclease is retained because it reduces the misincorporation
of erroneous dNTPs during the synthesis of the new DNA strand;
however, 5' exonuclease activity is abolished because it would degrade
some of the newly synthesized DNA. With the available 3' hydroxyl groups
provided by the base-paired random primers and the strands of the probe
as templates, new DNA synthesis occurs (Fig. 3.20). If a radioactive label is
used, then one of the dNTPs contains the isotope ³²P in the α-position
phosphate. Autoradiography is used to determine whether the labeled probe
sequences hybridize to sequences of a target DNA sample. Often today, the
deoxyribonucleotides, including the labeled dNTP, are incorporated into
the probe sequence using the polymerase chain reaction (PCR), generating
high yields of labeled probe DNA (see chapter 4).
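
The reason a mixture of all possible hexamers is certain to contain primers
for any template can be shown with a small sketch. The template below is an
arbitrary illustrative sequence, and the helper names are hypothetical; the
code models only sequence complementarity, not the kinetics of annealing.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Every possible sequence of 6 nucleotides: 4**6 different hexamers.
HEXAMERS = {"".join(p) for p in product("ACGT", repeat=6)}

def priming_hexamers(template):
    """Map each position on a single-stranded template to the hexamer
    (written 5' to 3') that is complementary to the 6 bases starting there."""
    return {i: reverse_complement(template[i:i + 6])
            for i in range(len(template) - 5)}

template = "ATCGTAGTCGTAGGTCGGTTAGCTTGAACC"   # illustrative probe template
primers = priming_hexamers(template)
```

Because the mixture holds all 4,096 hexamers, every position on the template
has a matching primer, so synthesis can start at many sites and the labeled
products together cover most of the template.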

For nonisotopic detection of hybridization, biotin can be attached to 
one of the four dNTPs that are incorporated during the DNA synthesis 
step. When a biotin-labeled probe hybridizes to the sample DNA, detection 

FIGURE 3.21 Schematic representation of the enzymatic activities of E. coli DNA
polymerase I. (A) The polymerase (red) adds deoxyribonucleotides to the 3'
hydroxyl groups of the growing chains. (B) The 5' exonuclease (blue) removes
successive nucleotides from 5' phosphate ends. (C) The 3' exonuclease (yellow)
removes successive nucleotides from 3' hydroxyl ends.




is based on the binding of an intermediary compound (e.g., streptavidin) to
biotin (see chapter 9). The intermediary compound carries an appropriate
enzyme that, depending on the assay system, may either form a chromogenic
(colored) molecule that can be visualized directly or produce a
chemiluminescent response that can be detected by autoradiography.

There are at least two possible sources of probes for screening a
genomic library. First, cloned DNA from a closely related organism (a
heterologous probe) can be used. In this case, the conditions of the
hybridization reaction can be adjusted to permit considerable mismatch between
the probe and the target DNA to compensate for the natural differences
between the two sequences. Second, a probe can be produced by chemical
synthesis. The nucleotide sequence of a synthetic probe is based on the
probable nucleotide sequence that is deduced from the known amino acid
sequence of the protein encoded by the target gene.
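
Because most amino acids are specified by more than one codon, a peptide
usually corresponds to many possible nucleotide sequences. The sketch below
counts them using the standard genetic code and picks the stretch of a peptide
with the fewest possibilities. The peptide and the function names are
hypothetical, and codon usage bias, which is often used to narrow the choice,
is ignored.

```python
from collections import Counter
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES),
                       AMINO_ACIDS))
CODONS_PER_AA = Counter(CODON_TABLE.values())

def probe_degeneracy(peptide):
    """Number of different DNA sequences that could encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)

def least_degenerate_window(peptide, n_residues):
    """Start index and degeneracy of the window needing the fewest probes."""
    starts = range(len(peptide) - n_residues + 1)
    best = min(starts, key=lambda i: probe_degeneracy(peptide[i:i + n_residues]))
    return best, probe_degeneracy(peptide[best:best + n_residues])

window = least_degenerate_window("LSRMWKHL", 4)   # hypothetical peptide
```

Regions rich in methionine and tryptophan, which have a single codon each,
give the least ambiguous probes.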

Genomic libraries are often screened by plating out the transformed
cells on the growth medium of a master plate and then transferring samples
of each colony to a solid matrix, such as a nitrocellulose or nylon
membrane. The cells on the membrane are broken open (lysed), the protein
is removed, and the DNA is bound to the membrane. At this stage, a
labeled probe is added, and if hybridization occurs, signals are observed on
an autoradiograph. The colonies from the master plate that correspond to
samples containing hybridized DNA are then isolated and cultured (Fig.
3.22). Because most libraries are created from partial digestions of genomic
DNA, a number of colonies may give a positive response to the probe. The
next task is to determine which clone, if any, contains the complete
sequence of the target gene. Preliminary analyses that use the results of gel
electrophoresis and restriction endonuclease mapping reveal the length of
each insert and identify those inserts that are the same and those that share
overlapping sequences. If an insert in any one of the clones is large enough
to include the full gene, then the complete gene can be recognized after
DNA sequencing because it will have start and stop codons and a contiguous
set of nucleotides that code for the target protein. Alternatively, a gene
can be assembled by using overlapping sequences from different clones.
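
Recognizing a candidate complete gene in a sequenced insert amounts to
finding an open reading frame: a start codon followed, in the same frame, by
a contiguous run of codons that ends at a stop codon. The following is a
minimal sketch that scans only the given strand and assumes ATG as the sole
start codon; real gene-finding tools consider both strands and additional
signals. The sequence and names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return (start, end) index pairs of open reading frames on `seq`.
    Each ORF runs from an ATG through the first in-frame stop codon;
    `end` is exclusive, so seq[start:end] includes the stop codon."""
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start until a stop codon is met.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs

insert = "CCATGAAATGGTAACC"          # hypothetical sequenced insert
orfs = find_orfs(insert)
```

An ATG with no in-frame stop codon before the end of the insert is not
reported, which corresponds to a clone that carries only part of a gene.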

Unfortunately, there is no guarantee that the complete sequence of a 
target gene will be present in a particular library. If the search for an intact 
gene fails, then another library can be created with a different restriction 
endonuclease and screened with either the original probe or probes derived 
from the first library. On the other hand, as discussed below, libraries that 
contain DNA fragments larger than the average prokaryotic gene can be 
created with specialized vectors to increase the chance that some members 
of the library will carry a complete version of the target gene. 

Screening by Immunological Assay 

Alternative methods are used to screen a library when a DNA probe is not
available. For example, if a cloned DNA sequence is transcribed and
translated, the presence of the protein, or even part of it, can be determined
by an immunological assay. Technically, this procedure has much in common
with a DNA hybridization assay. All the clones of the library are grown on
several master plates. A sample of each colony is transferred to a known
position on a matrix, where the cells are lysed and the released proteins are
attached to the matrix. The matrix with the bound proteins is treated with
an antibody (primary antibody) that specifically binds to the protein
FIGURE 3.22 Screening a library with a labeled DNA probe (colony hybridization).
Cells from the transformation reaction are plated onto solid agar medium under
conditions that permit transformed, but not nontransformed, cells to grow. (1)
From each discrete colony formed on the master plate, a sample is transferred to a
solid matrix, such as a nitrocellulose or nylon membrane. The pattern of the
colonies on the master plate is retained on the matrix. (2) The cells on the
matrix are lysed, and the released DNA is denatured, deproteinized, and
irreversibly bound to the matrix. (3) A labeled DNA probe is added to the matrix
under hybridization conditions. After the nonhybridized probe molecules are
washed away, the matrix is processed by autoradiography to determine which cells
have bound labeled DNA. (4) A colony on the master plate that corresponds to the
region of positive response on the X-ray film is identified. Cells from the
positive colony on the master plate are subcultured because they may carry the
desired plasmid-cloned DNA construct.


encoded by the target gene. Following the interaction of the primary
antibody with the target protein (antigen), any unbound antibody is washed
away, and the matrix is treated with a second antibody (secondary antibody)
that is specific for the primary antibody. In many assay systems, the
secondary antibody has an enzyme, such as alkaline phosphatase, attached
to it. After the matrix is washed, a colorless substrate is added. If the
secondary antibody has bound to the primary antibody, the colorless substrate
is hydrolyzed by the attached enzyme and produces a colored compound
that accumulates at the site of the reaction (Fig. 3.23).

The colonies on the master plate that correspond to positive results
(colored spots) on the matrix contain either an intact gene or a portion of
the gene that is large enough to produce a protein product that is recognized
by the primary antibody. After detection by immunoassay of genomic
DNA libraries, the positive clones must be characterized to determine
which, if any, carry a complete gene.

Screening by Protein Activity 

DNA hybridization and immunological assays work well for many kinds 
of genes and gene products. If the target gene produces an enzyme that is 
not normally made by the host cell, a direct (in situ) plate assay can be 


FIGURE 3.23 Immunological screening of a gene library (colony immunoassay). Cells
from the transformation reaction are plated onto solid agar medium under
conditions that permit transformed, but not nontransformed, cells to grow. (1)
From the discrete colonies formed on this master plate, a sample from each colony
is transferred to a solid matrix, such as a nitrocellulose or nylon membrane. (2)
The cells on the matrix are lysed, and their proteins are bound to the matrix. (3)
The matrix is treated with a primary antibody that binds only to the target
protein. (4) Unbound primary antibody is washed away, and the matrix is treated
with a secondary antibody that binds only to the primary antibody. (5) Any
unbound secondary antibody is washed away, and a colorimetric reaction is carried
out. The reaction can occur only if the secondary antibody, which is attached to
an enzyme (E) that performs the reaction, is present. (6) A colony on the master
plate that corresponds to a positive response on the matrix is identified. Cells
from the positive colony on the master plate are subcultured because they may
carry the plasmid-insert DNA construct that encodes the protein that binds the
primary antibody.
devised to identify members of a library that carry the particular gene
encoding that enzyme. The genes for α-amylase, endoglucanase,
β-glucosidase, and many other enzymes from various organisms have been
isolated in this way. This approach has proven effective for isolating genes
encoding biotechnologically useful enzymes from microorganisms present
in environmental samples. Most of the organisms contained in these samples
cannot be grown in the laboratory outside of their natural environment.
However, the total genomic DNA from these organisms can be
extracted directly from the sample, for example, a soil sample, and used to
prepare a metagenomic library that can be expressed in a host bacterium,
such as E. coli, and screened for target protein activity. This technique has
enabled the isolation of many novel proteins with interesting properties
without the need to first culture the natural host microorganism.

In some cases, the cells of a genomic library are plated onto medium 
supplemented with a specific substrate; if the substrate is hydrolyzed, a 
colorimetric reaction identifies the colony that carries the target gene (Fig. 
3.24). For example, to detect a cloned bacterial lipase gene, transformed 
cells are grown in the presence of trioleoglycerol and the fluorescent dye 
rhodamine B. As a result of hydrolysis of the substrate, positive colonies 
have orange fluorescent halos when viewed under ultraviolet light. Other 
detection systems do not rely on a colorimetric reaction for discovering a 
particular gene. For example, a transformed cell with a conjugated bile acid 
hydrolase gene from Lactobacillus plantarum was detected by growing the 
members of the genomic library in the presence of bile salts. In this case, a
hydrolase-positive colony was easily identified because it became surrounded
with a ring of precipitated, free bile acids.

Functional (genetic) complementation is another useful way of isolating
genes that encode enzymes. In this procedure, the host cell does not have the
enzyme activity of interest because the gene encoding the enzyme carries a
mutation that abolishes the activity of the enzyme. Next, a DNA library is
constructed that carries fragments of genomic DNA from an organism that
has the desired enzyme activity. Host cells with the genetic deficiency are
transformed with plasmids of the DNA library, and transformed cells that
have restored normal enzyme function are selected (Fig. 3.25). The genomic
DNA that is used to prepare the library can be from a variety of donor
organisms, such as the wild-type strain of the host bacterium that carries a
functional copy of the gene encoding the enzyme, a different organism that can
be either another prokaryote or perhaps a eukaryote, or uncultured organisms
that are present in an environmental sample. E. coli and yeast cells with
mutations that affect various biochemical pathways have frequently been
used as host cells for functional complementation gene cloning. In many of
these experiments, the protein derived from the cloned gene enables the host
cell to grow on minimal medium, whereas growth of the mutant cells
requires the addition of a specific compound to the medium. Furthermore,
genes that play a role in antibiotic biosynthesis, root nodulation, and other
processes have been isolated in this way.

In practice, the availability of genomic sequences from a great number 
of organisms, in which the protein coding regions have been identified and 
in many cases assigned a known or predicted function, has rendered library 
screening unnecessary for some applications. Where the nucleotide sequence 
of a gene of interest is known, the gene can be cloned by designing short 
oligonucleotide primers that bind specifically to complementary sequences 


[Figure 3.24 labels: colony expressing functional enzyme from cloned gene;
medium containing substrate for enzyme.]

FIGURE 3.24 Screening a genomic library for enzyme activity. Cells of a genomic
library are plated onto solid medium containing the substrate for the enzyme
of interest. If a functional enzyme is produced by a colony that carries a
cloned gene encoding the enzyme, the substrate is converted to a colored
product that can be easily detected. Note that other, noncolored colonies on
the medium also contain fragments of the genomic library, but they do not
carry the gene for the enzyme of interest.
FIGURE 3.25 Gene cloning by functional complementation. Host cells that are
defective in a certain function, e.g., A⁻, are transformed with plasmids from a
genomic library derived from cells that are normal with respect to function A,
i.e., A⁺. Only transformed cells that carry a cloned gene that confers the A⁺
function will grow on minimal medium. The cells that show complementation are
isolated, and the insert of the vector is studied to characterize the gene that
corrects the defect in the mutant host cells.


within the target gene in a sample of genomic DNA and from which DNA 
synthesis can be initiated in a reaction known as the PCR (see chapter 4). 


Cloning DNA Sequences That Encode Eukaryotic Proteins 

Special strategies are required for cloning and expressing eukaryotic 
coding regions in prokaryotic cells. Basically, a eukaryotic structural gene 
will not function in a prokaryotic organism because there is no mechanism 
for removing introns from transcribed RNA. Moreover, a eukaryotic DNA 
sequence needs prokaryotic transcriptional and translational control 
sequences to be properly expressed. Parenthetically, a cloned prokaryotic 
























gene also has the same constraint unless the insert carries regulatory
sequences that are compatible with the transcription and translation systems
of the host cell. For eukaryotic genes, the "intron problem" is overcome
by synthesizing double-stranded DNA copies (complementary DNA
[cDNA]) of purified messenger RNA (mRNA) molecules that lack introns
and cloning the cDNA molecules into a vector to create a cDNA library.
Often, a cDNA library represents the mRNA sequences from a single specific
tissue.

Functional eukaryotic mRNA does not have introns because they have
been removed by the splicing machinery of the eukaryotic cell. The mRNA
has a G cap at the 5' end and, usually, a string of up to 200 adenine residues
[poly(A) tail] at the 3' end. The poly(A) tail provides the means for
separating the mRNA fraction of a tissue from the more abundant ribosomal
RNA (rRNA) and transfer RNA (tRNA). Short chains of 15 thymidine residues
(oligodeoxythymidylic acid [oligo(dT), or dT15]) are attached to cellulose
beads, and the oligo(dT)-cellulose beads are packed into a column.
Total RNA extracted from eukaryotic cells or tissues is passed through the
oligo(dT)-cellulose column, and the poly(A) tails of the mRNA molecules
bind by base pairing to the oligo(dT) chains. The tRNA and rRNA molecules,
which lack poly(A) tails, pass through the column. The mRNA is removed
(eluted) from the column by treatment with a buffer that breaks the A:T
hydrogen bonds, thereby releasing the base-paired mRNAs (Fig. 3.26).
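
The logic of the column separation can be mimicked with a toy filter that
retains RNA molecules ending in a run of adenines and lets the rest pass
through. The sequences below are invented, and the requirement of at least 15
terminal adenines (to match the length of the oligo(dT) chains) is an
assumption made for this sketch rather than a measured binding threshold.

```python
def has_poly_a_tail(rna, min_tail=15):
    """True if the RNA (written 5' to 3') ends in at least `min_tail` A's."""
    return rna.endswith("A" * min_tail)

def oligo_dt_select(rnas, min_tail=15):
    """Split RNAs into those retained by oligo(dT) and the flow-through."""
    bound = [r for r in rnas if has_poly_a_tail(r, min_tail)]
    flow_through = [r for r in rnas if not has_poly_a_tail(r, min_tail)]
    return bound, flow_through

total_rna = [
    "AUGGCUUGGUAA" + "A" * 40,   # hypothetical polyadenylated mRNA
    "GGCUACGUAGCUCAGUUGG",       # hypothetical tRNA-like molecule
    "UACCUGGUUGAUCCUGCCAG",      # hypothetical rRNA-like molecule
]
bound, flow_through = oligo_dt_select(total_rna)
```

Here only the first molecule is retained; eluting the column corresponds to
collecting the `bound` list.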

Before the mRNA molecules can be cloned into a vector, they must be 
converted to double-stranded DNA. This synthesis is accomplished by 
using, in succession, two different kinds of nucleic acid polymerases. 
Reverse transcriptase synthesizes the first DNA strand, and E. coli DNA 
polymerase I synthesizes the second (Fig. 3.27). After the mRNA fraction is 
purified, short unattached sequences of oligo(dT) molecules are added to 
the sample, along with the enzyme reverse transcriptase and the four 
dNTPs. An oligo(dT) molecule base pairs with the adenine residues of the 
poly(A) tail of an mRNA and provides an available 3' hydroxyl group to 
prime the synthesis of the first cDNA strand. 

Reverse transcriptase, which is encoded by certain RNA viruses
(retroviruses), uses an RNA strand as a template while directing
deoxyribonucleotides into the growing chain. Thus, when an A, G, C, or U
nucleotide of the template RNA strand is encountered, the complementary
deoxyribonucleotide (i.e., T, C, G, or A) is incorporated into the growing DNA
strand. Unfortunately, full-length first cDNA strands are not always produced
by reverse transcriptase in vitro. Incomplete first DNA strands are due to the
inability of reverse transcriptase to proceed to the ends of long mRNA
templates, frequent pausing of the enzyme during synthesis, and intrastrand
base-paired configurations (secondary structure) impeding synthesis.
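
The base-pairing rule used by reverse transcriptase, and the effect of
incomplete synthesis, can be written out as a short sketch. The mRNA is an
invented example, and `stop_after` is a hypothetical parameter that simply
truncates the product to imitate an enzyme that stops early.

```python
# Template RNA base -> deoxyribonucleotide incorporated into the cDNA.
RNA_TO_DNA = str.maketrans("AGCU", "TCGA")

def first_strand_cdna(mrna, stop_after=None):
    """First-strand cDNA, written 5' to 3', copied from an mRNA (5' to 3').
    Synthesis starts at the 3' end of the mRNA, where the oligo(dT) primer
    anneals, and proceeds toward the 5' end of the mRNA."""
    template = mrna[::-1]                 # read the mRNA from its 3' end
    if stop_after is not None:
        template = template[:stop_after]  # enzyme stopped before the 5' end
    return template.translate(RNA_TO_DNA)

mrna = "AUGGCAAAAA"                        # hypothetical short mRNA
full_length = first_strand_cdna(mrna)
truncated = first_strand_cdna(mrna, stop_after=7)
```

A truncated product lacks the sequence complementary to the 5' end of the
mRNA, which is why incomplete first strands yield clones missing the start of
the coding region.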

The second, complementary DNA strands are generated by treating the
RNA-DNA (heteroduplex) molecules with ribonuclease H (RNase H),
which nicks the mRNA strands, thereby providing a free 3' hydroxyl group
for initiation of DNA synthesis. As synthesis of the second strand progresses
from the nicks, the 5' exonuclease activity of DNA polymerase I removes
any nucleotides that are encountered. The eventual length of the second
strand depends on the length of the first DNA strand and the location of the
nick in the mRNA molecule relative to the 3' end of the first DNA strand.
The synthesis of second strands is often initiated several nucleotides from
the 5' end of the mRNA. However, obtaining some full-length cDNAs for
cloning is usually not a problem because most eukaryotic mRNAs have
FIGURE 3.26 Schematic representation of oligo(dT)-cellulose separation of
polyadenylated mRNA from total cellular RNA.


noncoding leader sequences that range from 40 to 80 nucleotides in length
and precede the coding sequence. In other words, a few of the cDNAs will
have a complete protein coding sequence and a truncated leader sequence.
After the second DNA strand synthesis is terminated, the ends of the cDNA
molecules are blunt-ended (end repaired, or polished) with T4 DNA
polymerase, which removes 3' extensions and fills in from 3' recessed ends.
Chemically synthesized adaptors with extensions for a restriction endonuclease
recognition sequence are ligated to the ends of the cDNA molecules
to facilitate cloning of the cDNAs into a vector (Fig. 3.27).

A cDNA library is screened by DNA hybridization to identify clones
that carry a specific plasmid-cDNA construct. Positive clones must be
examined further to determine which one(s) carries the complete coding
sequence for the target protein. Once a full-length cDNA is discovered, the
sequence can be retrieved and cloned into a vector that is designed to support
its expression in a prokaryotic cell.

As noted above, the standard cDNA synthesis protocols produce both 
complete (full-length) and incomplete molecules. Unfortunately, much 
time and effort can be spent on identifying clones of a cDNA library with 
full-length sequences. Various strategies have been devised to overcome 
FIGURE 3.27 Synthesis of cDNA. (A) Oligo(dT) primer is added to a purified
mRNA preparation, and reverse transcriptase with the four dNTPs is used for
the production of a complementary (cDNA) strand from the RNA template.
Reverse transcriptase does not always produce full-length cDNA copies from
every mRNA template due to mRNA secondary structure, i.e., hairpin loops, or
other factors. For second-strand DNA synthesis, the mRNA is nicked by RNase
H, which creates initiation sites for E. coli DNA polymerase I. The 5'
exonuclease activity of DNA polymerase I removes both RNA and DNA sequences
that are encountered as synthesis from the nick closest to the 5' end of the
mRNA progresses. (B) The cDNA molecules are end repaired with T4 DNA
polymerase, and adaptors containing restriction enzyme recognition sequences
are ligated to the ends to increase the efficiency of cloning.
this inconvenience. A method for generating full-length cDNA molecules
that is based on PCR is presented in chapter 4. Here, a multistep procedure
for capturing full-length first-strand cDNAs that are used as templates for
the synthesis of second strands is described (Fig. 3.28). Briefly, the primer
for first-strand DNA synthesis is a polydeoxythymidylic acid [poly(dT)]
sequence at the 3' end of a synthetic nucleic acid sequence that also contains
a recognition site for a restriction endonuclease. This dual-function
oligonucleotide is called a primer-adaptor. The disaccharide trehalose is
added to the reverse transcriptase reaction to stabilize the enzyme and
allow DNA synthesis to proceed at a high temperature. Secondary structure
(due to intrastrand base pairing) in the mRNAs is disrupted by high
temperature, and the likelihood that complete molecules will be synthesized
is increased. In addition, one of the four dNTPs in the reverse transcriptase
reaction mixture is 5-methyl-dCTP, which is incorporated into the
growing strand. The presence of methyl groups in one strand
(hemimethylation) of double-stranded DNA protects the DNA from being cleaved
by certain restriction endonucleases. This DNA modification is important for
the final step of the procedure.

After the first strand is synthesized, biotin is chemically attached to the
ribose sugars of the cap nucleotide at the 5' end and the nucleotide at the 3'
end of the mRNA molecules. Deoxyribose is not biotinylated under these
conditions. Next, the hybrid RNA-DNA molecules are treated with RNase I.
This enzyme cleaves single-stranded RNA; it attacks neither DNA strands nor
RNA that is base paired with DNA. As a result, both the 5' single-stranded
regions of the mRNAs with incomplete cDNAs and the nonpaired segments of
the poly(A) tails of the mRNA molecules are degraded. The mRNA strands that
are paired with completely synthesized cDNA strands are not affected by
this enzyme. The sample is then mixed with streptavidin-coated magnetic
beads. Biotin has a high affinity for streptavidin. After RNase I treatment,
the only biotinylated RNA-DNA hybrid molecules that remain are those
with a biotinylated cap. In other words, only full-length cDNAs are
captured because the 5' end of the mRNA is base paired with the cDNA and is
therefore protected from cleavage by RNase I, leaving the biotin molecule
attached. A magnet is used to separate the beads from the solution. Next,
the RNA of the streptavidin-bound RNA-DNA hybrids is hydrolyzed with
RNase H, which cuts base-paired RNA and releases the full-length cDNA
strands into solution.
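
The logic of the capture step can be illustrated with a minimal sketch in
Python. It is a model under stated assumptions, not a description of any
laboratory software: each RNA-DNA hybrid is reduced to two lengths, the
names Hybrid and capture_full_length are hypothetical, and the 3' biotin is
assumed always to be lost with the unpaired part of the poly(A) tail. A
hybrid keeps its cap biotin, and is therefore held on the streptavidin
beads, only when the cDNA extends to the 5' end of the mRNA.

```python
from dataclasses import dataclass


@dataclass
class Hybrid:
    """An mRNA-cDNA hybrid after first-strand synthesis (lengths in nt)."""
    name: str
    mrna_len: int  # mRNA length from the cap to the start of the poly(A) tail
    cdna_len: int  # how far reverse transcriptase extended toward the cap

    def cap_protected(self) -> bool:
        # The cap region is base paired, and so resists RNase I, only if the
        # cDNA reaches the 5' end of the mRNA.
        return self.cdna_len >= self.mrna_len


def capture_full_length(hybrids):
    """Return the hybrids that still carry the cap biotin after RNase I."""
    return [h for h in hybrids if h.cap_protected()]


# Hypothetical pool: two complete and two incomplete first strands.
pool = [
    Hybrid("a", 1500, 1500),
    Hybrid("b", 1500, 900),
    Hybrid("c", 800, 800),
    Hybrid("d", 2200, 2150),
]
captured = capture_full_length(pool)
print([h.name for h in captured])
```

In this toy pool, only hybrids "a" and "c" are retained; "d" is lost even
though its cDNA is only 50 nucleotides short of full length.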

Since the sequences at the 3' ends of the first-strand cDNAs are not
known, a string of guanine nucleotides is added to the 3' hydroxyl ends to
provide a complementary sequence for a DNA primer that initiates the
synthesis of second cDNA strands. The addition of the guanine nucleotides
is performed by the enzyme terminal deoxynucleotidyl transferase, which
adds dNTPs sequentially by phosphodiester bond formation to the 3'
hydroxyl end of a polynucleotide in the absence of a template strand. If
only one type of dNTP is present in the terminal transferase reaction
mixture, in this case dGTP, a homopolymeric tail is formed. After
polydeoxyguanylic acid [poly(dG)] tailing, a polydeoxycytidylic acid
[poly(dC)] primer-adaptor is added that base pairs with the poly(dG) tail
and provides an available 3' hydroxyl group for second-strand cDNA
synthesis. The adaptor portion of this oligonucleotide contains the sequence
for another restriction endonuclease site. Second-strand cDNA synthesis is
carried out at high temperatures with thermostabilized DNA polymerase,
RNase H, and DNA ligase. As with the synthesis of the first cDNA strand,
the high temperature diminishes intrastrand folding and increases the
efficiency of synthesis of a full-length strand. The dCTP in the dNTP mixture
used for second-strand synthesis is not methylated. RNase H removes any
RNA that escaped the previous treatment, and DNA ligase joins segments
that were primed by remaining bits of RNA. The final product is a
full-length, double-stranded cDNA that is furnished with nonmethylated
restriction endonuclease recognition sites at both ends. These sites are
cleaved with the appropriate restriction endonucleases, and the cDNA is
cloned into a vector that has complementary extensions. Hemimethylation,
generated by incorporation of methylated nucleotides during the synthesis of
the first cDNA strand, protects a cDNA from cleavage if it contains the same
restriction endonuclease sites that are used for cloning because the
restriction endonucleases do not cut at methylated sites.

FIGURE 3.28 Schematic representation of a method for selecting and
cloning full-length cDNA molecules. (1) Purified mRNA (blue) is mixed
with an oligonucleotide (primer-adaptor) with an oligo(dT) sequence
and restriction endonuclease site (yellow). (2) Reverse transcriptase
synthesizes the first cDNA strand (red) with 5-methyl-dCTP (red boxes)
as one of the four dNTPs. Both incomplete and complete DNA strands
are synthesized. (3) Biotin (light-blue boxes labeled B) is attached to the
ends of mRNA molecules. RNase I-susceptible regions are marked by
square brackets. (4) Single-stranded segments of RNA are degraded by
RNase I. (5) Biotinylated molecules bind to streptavidin-coated
magnetic beads (pink). After RNase I treatment, full-length RNA-DNA
hybrids are biotinylated, and therefore, they bind to streptavidin.
Incomplete cDNAs are not biotinylated and do not bind to streptavidin.
(6) RNase H treatment degrades the RNA of streptavidin-bound
RNA-DNA hybrids and releases full-length, first-strand cDNA
molecules. (7) A poly(dG) tail is added to the 3' hydroxyl end of the first
cDNA strand. (8) An oligonucleotide (primer-adaptor) with an
oligodeoxycytidylic acid [oligo(dC)] sequence and a restriction endonuclease
site (green) pairs with the oligodeoxyguanylic acid [oligo(dG)] tail and
provides a 3' hydroxyl group for synthesis of the second DNA strand by
DNA polymerase. (9) During the synthesis of the second cDNA strand
by DNA polymerase, none of the dNTPs are methylated; RNase H
removes any remaining base-paired RNA, and DNA ligase joins DNA
segments that were synthesized internally from bits of mRNA that
escaped degradation. The oligo(dC) primer-adaptor sequence acts as a
template for DNA synthesis from the 3' hydroxyl group at the end of the
poly(dG) tail. (10) The final full-length cDNAs are cut with two
restriction endonucleases, one for each end, and cloned into a vector that
has complementary extensions. Hemimethylation protects the cDNA from
cleavage by the restriction endonucleases that are used for cloning
because these enzymes cannot cut methylated restriction endonuclease
sites.


Vectors for Cloning Large Pieces of DNA 

Bacteriophage λ Vectors

The plasmid-based vectors used for cloning DNA molecules generally
carry up to 10 kb of inserted DNA. However, for the formation of a library,
it is often helpful to be able to maintain larger pieces of DNA. To this end,
various high-capacity cloning systems have been developed (Table 3.7).
The E. coli virus (bacteriophage, or phage) λ has been engineered to be a
vector for inserts in the range of 15 to 20 kb.

After bacteriophage λ infects E. coli by injection of its DNA, two
possibilities exist. It can enter a lytic cycle that, after 20 minutes, leads
to the lysis of the host cell and the release of about 100 phage particles.
Alternatively, the injected bacteriophage λ DNA can be integrated into the
E. coli chromosome as a prophage and can be maintained more or less
indefinitely as a benign guest (lysogen) through successive cell divisions
(Fig. 3.29). However, under conditions of nutritional or environmental
stress, the chromosomally integrated bacteriophage λ DNA is excised and
enters a lytic cycle. The bacteriophage λ DNA is about 50 kb in length, of
which approximately 20 kb is essential for the integration-excision (I/E)
events. For forming genomic libraries, it was reasoned that this 20 kb of
DNA could be replaced with 20 kb of cloned DNA and, subsequently, this
recombined DNA molecule could be perpetuated as a "recombinant"
bacteriophage λ through compulsory lytic cycles.

TABLE 3.7 Insert capacities of some commonly used vector systems

Vector system                                    Host cell              Insert capacity (kb)
Plasmid                                          E. coli                0.1-10
Bacteriophage λ                                  λ/E. coli              10-20
Cosmid                                           E. coli                35-45
Fosmid                                           E. coli                35-45
Bacteriophage P1                                 E. coli                80-100
BAC                                              E. coli                50-300
P1 bacteriophage-derived artificial chromosome   E. coli                100-300
Yeast artificial chromosome                      Yeast                  100-2,000
Human artificial chromosome                      Cultured human cells   >2,000


FIGURE 3.29 Life cycle of bacteriophage λ. Lysogeny occurs when the bacteriophage
λ genome (red) becomes integrated into the host chromosome (blue); otherwise, the
lytic cycle is initiated, resulting in the production and release by cell lysis of about
100 bacteriophage particles about 20 minutes after infection.


To appreciate how bacteriophage λ cloning systems function, some
understanding of the molecular aspects of the lytic cycle is necessary. An
infective bacteriophage λ consists of a tubular protein tail with a protein
tail fiber and a protein head packed with 50 kb of DNA. The production
and assembly of the heads and tails and the packaging of DNA are a highly
coordinated sequence of events. The DNA within the head of a λ particle is
a 50-kb linear molecule with a 12-base single-stranded extension at the 5'
end of each strand. These extensions are called cohesive (cos) ends because
they contain sequences that are complementary to each other. After the
injection of the λ DNA through the tail into E. coli, the cos ends base pair to
form a circular DNA molecule. During the early phase of the lytic cycle,
DNA replication from the circular molecule creates a linear form of λ DNA
that is composed of several contiguous lengths of 50-kb units, i.e.,
concatemers (Fig. 3.30A). Each newly assembled head is filled with one 50-kb
unit of DNA, and finally, the tail assembly is added to complete the
formation of an infective particle (Fig. 3.30B). The volume of the
bacteriophage λ head is sufficient for about 50 kb. If less than 38 kb of DNA
is packed into a head, a noninfective bacteriophage particle is produced.
More than 52 kb of DNA cannot fit into a head. The location of the cos
sequences, which are 50 kb apart in the multiple-length linear λ DNA, ensures
that each head receives the correct amount of DNA. Located at the opening of
the head is an enzyme that recognizes the double-stranded cos sequence and
cuts the DNA at this site as the DNA is inserted into the head. By mixing
purified empty heads, bacteriophage λ DNA (50 kb), and tail assemblies,
infective particles are produced in a reaction tube.

FIGURE 3.30 Packaging of bacteriophage λ DNA into heads during the lytic cycle.
(A) DNA replication from the circular form of bacteriophage λ creates a linear form
that has contiguous, multiple lengths (concatemers) of bacteriophage DNA with
units of approximately 50 kb each. (B) Each newly assembled head is filled with a
50-kb unit of λ DNA before the tail assembly is attached.


One of the many bacteriophage λ cloning vectors that have been
devised has two BamHI sites that flank the I/E region. When purified DNA
from this bacteriophage is cut with BamHI, three segments are created. The
left arm (L region) contains the genetic information for the production of
heads and tails, the right arm (R region) carries the genes for DNA
replication and cell lysis, and the middle (I/E) fragment has the genes for
the integration and excision processes. The objective of this
genetic-engineering protocol is to replace the middle, I/E, segment of the λ
DNA with cloned DNA that is approximately 20 kb in length (Fig. 3.31). The
BamHI-treated bacteriophage λ DNA sample is enriched for L and R arms by
size fractionation and removal of I/E segments. The source DNA is cut with
either BamHI or Sau3AI, and DNA pieces that are 15 to 20 kb in length are
isolated. The digested source DNA and the L and R regions are combined and
incubated with T4 DNA ligase. Then, empty bacteriophage heads and tail
assemblies are added. Under these conditions, 50-kb units of DNA, with
insert DNA flanked by L and R regions with cos ends, are packaged into
the heads, and infective bacteriophage particles are formed. Other products
from the ligation reaction cannot be packaged because they are either too
large (>52 kb) or too small (<38 kb). Also, any 50-kb DNA molecules
without a functional origin of replication and cos ends cannot be
perpetuated. Recombinant bacteriophage λ undergoes lytic cycles and is
maintained by growth in E. coli.
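
The fate of the different ligation products described above can be
summarized in a short Python sketch. The packaging limits (38 and 52 kb)
are taken from the text; everything else is an illustrative assumption: the
function name classify is hypothetical, the division of the roughly 30 kb of
arm DNA into a 20-kb L arm and a 10-kb R arm is chosen only for the example,
and fragment orientation is ignored.

```python
# Sizes in kb. The packaging limits come from the text; the 20/10 split of
# the roughly 30 kb of arm DNA is an illustrative assumption.
L_ARM, R_ARM = 20.0, 10.0
MIN_PACKAGED, MAX_PACKAGED = 38.0, 52.0


def classify(product):
    """Classify a ligation product given as an ordered list of (kind, kb)."""
    kinds = [kind for kind, _ in product]
    total = sum(size for _, size in product)
    if total < MIN_PACKAGED:
        return "too small to package"
    if total > MAX_PACKAGED:
        return "too large to package"
    # Only L arm + insert(s) + R arm carries both cos ends and the
    # replication genes that the recombinant phage needs.
    has_arms = (
        len(kinds) >= 3
        and kinds[0] == "L"
        and kinds[-1] == "R"
        and all(kind == "insert" for kind in kinds[1:-1])
    )
    if has_arms:
        return "packaged and perpetuated"
    return "cannot be perpetuated"


products = {
    "L + insert + R": [("L", L_ARM), ("insert", 18.0), ("R", R_ARM)],
    "L + R only": [("L", L_ARM), ("R", R_ARM)],
    "inserts only": [("insert", 18.0), ("insert", 17.0), ("insert", 15.0)],
    "L + 2 inserts + R": [
        ("L", L_ARM), ("insert", 20.0), ("insert", 18.0), ("R", R_ARM),
    ],
}
for name, product in products.items():
    print(f"{name}: {classify(product)}")
```

With these example sizes, only the 48-kb molecule with one insert between
the two arms is both packaged and perpetuated, which is why the size
selection of 15- to 20-kb source fragments matters.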

FIGURE 3.31 A bacteriophage λ cloning system. Bacteriophage λ is engineered
to have two BamHI sites that flank the I/E region. For cloning, the source
DNA is cut with BamHI and fractionated by size to isolate pieces that are
about 15 to 20 kb long. The bacteriophage λ DNA is also cut with BamHI, and
size fractionation removes the I/E segment. The L and R arms, plus the 15- to
20-kb source DNA molecules, are mixed with T4 DNA ligase. The ligation
reaction produces a number of different DNA molecules, including ligated
source DNA only, combined L and R arms only, and molecules that have a
source DNA molecule flanked by L and R arms. The last molecules are
packaged into bacteriophage heads in vitro, and infective particles are
formed after the addition of tail assemblies. The recombined bacteriophage λ
is perpetuated by infection of E. coli. Some 50-kb source DNA ligation
products may be packaged into heads, but since this DNA lacks both a
functional origin of replication and cos ends, it cannot be perpetuated.
Other ligation products are either too small or too large to be packaged. For
some bacteriophage λ cloning systems (not shown here), high packaging
efficiency is achieved by setting the conditions of ligation to favor
concatemer formation to imitate how the phage heads are normally filled.


To identify the recombinant bacteriophage that carries the target gene
in the bacteriophage λ library, the individual zones of lysis (plaques), each
of which contains a recombinant bacteriophage, are lifted onto a matrix
and screened with either DNA probes or antibodies. For DNA
hybridization, the bacteriophage proteins are removed and the DNA is
denatured and bound to a matrix. For immunological assays, the proteins
encoded by cloned genes that are synthesized during the lytic cycle, along
with bacteriophage and bacterial proteins, are transferred with the plaque
and subsequently bound to the matrix. On the basis of the sites on the matrix
that give positive responses, corresponding plaques on the original plate are
subcultured to provide a source of selected recombinant bacteriophage that
is individually cultured in E. coli.

Cosmids 

Cloning vectors called cosmids can carry about 45 kb of cloned DNA and
are maintained as plasmids in E. coli. Cosmids combine the properties of
plasmids and bacteriophage λ vectors. For example, the commonly used
cosmid pLFR-5 has two cos sites (cos ends) from bacteriophage λ flanking
a ScaI restriction endonuclease site, a multiple cloning site with six unique
recognition sites (HindIII, PstI, SalI, BamHI, SmaI, and EcoRI), an origin of
DNA replication, and a Tetʳ gene (Fig. 3.32). Pieces of DNA that are
approximately 45 kb in length are purified by sucrose density gradient
centrifugation from a partial BamHI digestion of the source DNA (Fig. 3.32).
The pLFR-5 DNA (~6 kb) is cleaved initially with ScaI and then with BamHI.
The final two DNA samples are mixed and ligated. Some of the ligated
products have an ~45-kb DNA piece inserted between the two fragments
that are derived from the digestions of the pLFR-5 DNA. These molecules
are about 50 kb long and have cos sequences that are about 50 kb apart.
Consequently, these DNA constructs are successfully packaged into
bacteriophage λ heads in vitro. Reconstituted pLFR-5 without inserted DNA is
not packaged. After the assembly of bacteriophage particles, the DNA is
delivered by infection into E. coli (Fig. 3.32). Once inside the host cell, the
cos ends, which were cleaved during the in vitro packaging, base pair and
enable the linear DNA to circularize. This circular form is stable, and the
cloned DNA is maintained as a plasmid-insert DNA construct because the
vector DNA contains a complete set of plasmid functions. Moreover, the
Tetʳ gene allows colonies that carry the cosmid to grow in the presence of
tetracycline. Nontransformed cells are sensitive to tetracycline and die.

A fosmid is a kind of cosmid vector that carries up to 40 kb of insert
DNA and a cos site for in vitro bacteriophage λ packaging. The difference
between a cosmid and a fosmid is that the origin of replication of a fosmid
is derived from the E. coli F factor (sex plasmid), hence the name. The
advantage of fosmids is that they are very stable single-copy vectors,
whereas cosmids are maintained at higher copy numbers, which often
leads to deletions or rearrangements of parts of the insert DNA.

Other E. coli bacteriophages have been used for creating vectors. For
example, the genome of the P1 bacteriophage is 115 kb long. The P1 vector
system can carry 80 to 100 kb of inserted DNA. The advantages of using
cosmids and other vectors derived from bacteriophages are twofold. First,
because the capacity of these vectors is greater than that of plasmids, gene
clusters and large genes can be cloned. Second, a larger insert in the cloning
vehicle means that fewer clones of a genomic library have to be screened
for a specific gene.

FIGURE 3.32 A cosmid cloning system. The cosmid contains an E. coli origin
of replication (ori) that allows the cosmid to be maintained as a plasmid in
E. coli; two intact cos sites closely flanking a unique ScaI site; a unique
BamHI site near, but outside, one of the cos sites; and a Tetʳ gene. The
source DNA is cut with BamHI and fractionated by size to isolate molecules
that are about 45 kb long. The plasmid DNA is cut with ScaI and BamHI. The
two DNA samples are mixed and treated with T4 DNA ligase. After ligation,
some of the joined DNA molecules will have a 45-kb piece of DNA inserted
into the BamHI site of the plasmid; when this happens, the two cos sequences
are about 50 kb apart. These molecules are packaged into bacteriophage λ
heads in vitro, and infective particles are formed after the addition of tail
assemblies. Infective bacteriophage λ delivers a linearized DNA molecule
with cos extensions into E. coli. After entry into the host cell, the cos ends
base pair and the DNA ligase of the host cell seals the nicks. The circular
DNA molecule that is created in this way persists as a plasmid in the host
cell. In this case, transformed cells can be identified because they are
resistant to the antibiotic tetracycline.


TABLE 3.8 Estimated numbers of clones required for 99% probability of
representation of every DNA region in a genomic library for various
organisms and cloning vectors

                                No. of clones
Organism    Genome size (bp)    pBR322 (5 kb)   Bacteriophage λ (17 kb)   Cosmid (35 kb)   BAC (150 kb)
E. coli     4.6 × 10⁶           4,234           1,243                     602              138
Yeast       1.23 × 10⁷          11,326          3,329                     1,616            375
Fruit fly   1.2 × 10⁸           110,529         32,501                    15,787           3,681
Rice        5.7 × 10⁸           525,589         154,521                   75,009           17,497
Human       3.3 × 10⁹           3,090,475       898,392                   434,053          101,258
Frog        2.3 × 10¹⁰          19,315,480      6,438,493                 2,971,610        708,822

For the cloning vectors, the sizes in parentheses are the average sizes of the insert DNA.
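
The values in Table 3.8 are consistent with a widely used estimate for
library size, N = ln(1 - P)/ln(1 - f), where P is the desired probability
of representation and f is the average insert size divided by the genome
size. The text does not state which formula was used to build the table, so
its use here is an assumption, and the function name clones_required is
hypothetical. The following Python sketch applies the formula to the
E. coli row.

```python
import math


def clones_required(genome_bp: float, insert_bp: float, probability: float = 0.99) -> float:
    """Estimate library size as N = ln(1 - P) / ln(1 - f), with f = insert/genome."""
    fraction = insert_bp / genome_bp
    return math.log(1.0 - probability) / math.log(1.0 - fraction)


ECOLI_GENOME_BP = 4.6e6
vectors = {
    "pBR322 (5 kb)": 5_000,
    "Bacteriophage lambda (17 kb)": 17_000,
    "Cosmid (35 kb)": 35_000,
    "BAC (150 kb)": 150_000,
}
for name, insert_bp in vectors.items():
    n = clones_required(ECOLI_GENOME_BP, insert_bp)
    # The table entries correspond to the estimate truncated to a whole number.
    print(f"{name}: {math.floor(n):,}")
```

Truncating the result reproduces the E. coli entries in the table. The rows
for the larger genomes agree only approximately when the rounded genome
sizes shown in the table are used, and in practice the estimate would
normally be rounded up rather than truncated.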


High-Capacity Bacterial Vector Systems 

A vector system that carries very large inserts (>100 kb) is helpful for the
analysis of complex eukaryotic genomes. For example, these types of
vector systems are indispensable for creating libraries for genome
sequencing and for carrying one or more intact genes on a single insert. In
contrast to a small-insert library, a large-insert genomic library is more
likely to include all of the genetic material of the organism with fewer
clones to maintain (Table 3.8). A low-copy-number E. coli plasmid vector
that is based on the P1 bacteriophage cloning system has been devised for
cloning DNA molecules that are from 100 to 300 kb in length. The DNA
insert-vector constructs of this system are called bacteriophage P1-derived
artificial chromosomes. Similarly, the F plasmid (F-factor replicon, sex
plasmid, or fertility plasmid) of E. coli, which is present at one or two copies
per cell, has been used, along with the lacZ' selection system of the pUC
vectors, to construct an extremely stable cloning vector that carries DNA
inserts from 50 to 300 kb in length. The F-plasmid-based DNA insert-vector
constructs, which are used extensively, are called BACs.


Genetic Transformation of Prokaryotes 

Transferring DNA into E. coli

Transformation is the process of introducing free DNA into a bacterial cell.
For E. coli, which is the main host cell for recombinant DNA research, the
uptake of plasmid DNA is usually achieved by treating mid-log-phase cells
with ice-cold calcium chloride (CaCl₂) and then exposing them for
2 minutes to a high temperature (42°C). This treatment creates transient
openings in the cell wall that enable DNA molecules to enter the cytoplasm.
This method has a maximum transformation frequency of about 1 transformed
cell per 1,000 cells (i.e., 10⁻³). The transformation efficiency is
approximately 10⁷ to 10⁸ transformed colonies per microgram of intact plasmid
DNA. Although a 100% transformation frequency would be ideal, selection
schemes that enable plasmid-transformed cells to be readily identified
overcome the drawback of a low transformation frequency. In some other
bacteria, competence occurs naturally and, in some cases, can be enhanced
by the use of specific growth media or growth conditions. These bacteria
are usually easily transformed. Other DNA delivery systems must be used
for bacteria that are either refractory to chemically induced competence or
are not naturally competent.
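
Transformation frequency and transformation efficiency are different
ratios, and a brief calculation may make the distinction concrete. In the
Python sketch below, the function names and all of the numbers (cell counts,
colony counts, and DNA amounts) are hypothetical and chosen only to fall
within the ranges quoted above.

```python
import math


def transformation_frequency(transformants: float, viable_cells: float) -> float:
    """Fraction of the cells in the reaction that took up the plasmid."""
    return transformants / viable_cells


def transformation_efficiency(colonies: float, dna_micrograms: float) -> float:
    """Transformed colonies obtained per microgram of plasmid DNA."""
    return colonies / dna_micrograms


# Hypothetical experiment: 2 x 10^8 treated cells yield 2 x 10^5 transformants.
frequency = transformation_frequency(2.0e5, 2.0e8)

# Hypothetical plating: 250 colonies from 10 picograms (1e-5 micrograms) of DNA.
efficiency = transformation_efficiency(250, 1.0e-5)

print(f"frequency  = {frequency:.1e} transformants per cell")
print(f"efficiency = {efficiency:.1e} colonies per microgram")
```

Frequency is normalized to the number of cells and efficiency to the mass
of DNA, so a protocol can have a high efficiency even when only about one
cell in a thousand is transformed.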

Electroporation

The uptake of free DNA can be induced by subjecting bacteria to a
high-voltage electric field in the presence of DNA. This procedure is called
electroporation, a term that is a contraction of the descriptive phrase
"electric-field-mediated membrane permeabilization." The experimental
protocols for electroporation are different for various bacterial species. For
E. coli, the cells (~50 microliters) and DNA are placed in a chamber fitted
with electrodes (Fig. 3.33A), and a single pulse of approximately
25 microfarads, 2.5 kilovolts, and 200 ohms is administered for about
4.6 milliseconds. This treatment yields transformation efficiencies of 10⁹
transformants per microgram of DNA for small plasmids (~3 kb) and 10⁶ for
large plasmids (~136 kb). Similar conditions are used to introduce BAC vector
DNA into E. coli. Thus, electroporation is an effective way to transform
E. coli with plasmids containing inserts that are longer than 100 kb. Because
an appropriate set of electroporation conditions can be found for nearly all
bacterial species, this procedure has become standard for transforming
many different types of bacteria.
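
The pulse settings quoted above can be related to one another with a small
calculation. For a capacitor-discharge pulse, the decay time constant is the
product of resistance and capacitance, so 200 ohms and 25 microfarads give a
nominal value of 5 milliseconds, close to the duration of about
4.6 milliseconds given in the text. The Python sketch below assumes this
simple model; the function names are hypothetical, and the electrode gap of
0.2 cm is an assumed value that is not given in the text.

```python
import math


def time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Nominal time constant (tau = R x C) of a capacitor discharge, in ms."""
    return resistance_ohm * (capacitance_uf * 1e-6) * 1e3


def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Nominal field strength between parallel electrodes."""
    return voltage_kv / gap_cm


tau = time_constant_ms(200.0, 25.0)
field = field_strength_kv_per_cm(2.5, 0.2)  # 0.2-cm gap is an assumption

print(f"nominal time constant  = {tau:.1f} ms")
print(f"nominal field strength = {field:.1f} kV/cm")
```

A measured pulse somewhat shorter than the nominal time constant would be
expected if the cell suspension itself conducts some current, although the
text does not discuss this point.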

Very little is known about the mechanism of DNA uptake during
electroporation (Fig. 3.33B). It has been deduced, along the lines of the
explanation of chemically induced transformation, that transient pores are
formed in the cell wall as a result of the electroshock and that, after
contact with the lipid bilayer of the cell membrane, the DNA is taken into
the cell.


FIGURE 3.33 Electroporation. (A) Electroporation cuvette with a cell suspension
between two electrodes. (B) (1) Cells (yellow) and DNA (red) in suspension in an
electroporation cuvette prior to the administration of high-voltage electric field
(HVEF) pulses. (2) HVEF pulses induce openings in the cells (dashed lines) that
allow entry of DNA into the cells. (3) After HVEF pulsing, some cells acquire
exogenous DNA, and the HVEF-induced openings are resealed.


Conjugation 

For some bacteria, the natural system of transmitting plasmids from one
strain to another has been used to transport a plasmid-insert DNA
construct from a donor cell to a recipient cell that is not readily
transformed. Some plasmids are genetically equipped to form cell-to-cell
junctions through which plasmid DNA is transferred from one cell to another.
Effective contact between a donor cell and a recipient cell depends on
plasmid genes that encode conjugative functions. Moreover, the
mechanical transfer of the DNA requires plasmid genes that encode mobilizing
functions. Most of the plasmids that are used for recombinant DNA
research lack conjugative functions, and therefore, they are not passed to
recipient cells by conjugation. However, some nonconjugative plasmid
cloning vectors can be mobilized and transferred if the conjugative
functions are supplied by a second plasmid in the same cell. In other words,
by introducing a plasmid with conjugative functions into a bacterial cell that
carries a mobilizable plasmid cloning vector, it is possible to transfer the
plasmid cloning vector to a recipient cell that is difficult to transform by
other means.

FIGURE 3.34 Tripartite mating. A helper cell self-transfers a conjugative, mobilizing
plasmid with a Tetʳ gene to a donor cell. The plasmid from the helper cell provides
conjugative functions for the nonconjugative, mobilizable plasmid of the donor cell,
which carries a kanamycin resistance (Kanʳ) gene and enables transport of the latter
plasmid into the recipient cell. Unlike the recipient cells, neither helper nor donor
cells can grow on minimal medium, and they are resistant to kanamycin and
sensitive to tetracycline. The selection strategy identifies cells that are able to grow on
minimal medium. In this example, the cloning vector is transferred from E. coli to
P. putida. Although it is not shown, each plasmid has an origin of replication. The
plasmid from the donor cell replicates in the recipient cell.


The typical experimental protocol for this procedure entails mixing three
strains together. When the cells are close to each other, the conjugative
plasmid, which in this case is also mobilizable, can be self-transferred to the
cell with the mobilizable plasmid cloning vector. Then, with the help of the
conjugative plasmid, the plasmid cloning vector is transferred to a targeted
recipient cell. All possible combinations of plasmid transfer occur among the
cells, but the genetic features of the strains and plasmids are designed to
select for the targeted recipient cells that receive the cloning vector. For
example, one possible selection procedure uses a helper cell (E. coli) that
maintains a conjugative, mobilizable plasmid with a Tetʳ gene and cannot
grow on minimal growth medium; a donor cell (E. coli) that also cannot grow
on minimal growth medium and carries the nonconjugative, mobilizable
plasmid cloning vector that has a kanamycin resistance gene; and a recipient
cell (Pseudomonas putida) that can grow on minimal growth medium, has no
incompatible plasmid, and is sensitive to both tetracycline and kanamycin
(Fig. 3.34). After the conjugations are allowed to occur, the cells are grown
briefly in complete growth medium in the absence of antibiotics before being
transferred to minimal growth medium with kanamycin. Under the latter
growth conditions, only the targeted recipient cells that have acquired the
plasmid cloning vector can grow. Neither helper nor donor cells can grow on
minimal medium, and the recipient cells that did not receive a plasmid from
a donor cell cannot grow in the presence of kanamycin. Occasionally, the
targeted recipient cell may receive both types of plasmids. However, this rare
event can be detected by replica plating onto minimal medium with both
tetracycline and kanamycin. Colonies that are formed in the presence of both
antibiotics have acquired two different plasmids, and those that grow only
when kanamycin is present have the cloning vector. Since the transfer of
plasmid DNA requires conjugation among three bacterial strains, the process
has been designated tripartite mating.
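
The selection scheme for a tripartite mating amounts to a set of growth
rules, and these can be checked with a minimal Python sketch. The class
Cell, the function grows, and the list of cell types are hypothetical
constructs for this illustration; the growth requirements and resistance
markers follow the example described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    grows_on_minimal: bool
    resistances: frozenset = frozenset()


def grows(cell, minimal, antibiotics=()):
    """True if the cell can form a colony on the given medium."""
    if minimal and not cell.grows_on_minimal:
        return False
    return set(antibiotics) <= cell.resistances


TET, KAN = "tetracycline", "kanamycin"

# Cell types that may be present after the three strains are mixed.
population = [
    Cell("helper", False, frozenset({TET})),
    Cell("donor", False, frozenset({KAN})),
    Cell("donor + helper plasmid", False, frozenset({TET, KAN})),
    Cell("recipient, no plasmid", True),
    Cell("recipient + helper plasmid", True, frozenset({TET})),
    Cell("recipient + cloning vector", True, frozenset({KAN})),
    Cell("recipient + both plasmids", True, frozenset({TET, KAN})),
]

# Step 1: minimal medium with kanamycin.
selected = [c.name for c in population if grows(c, True, [KAN])]
# Step 2: replica plating on minimal medium with kanamycin and tetracycline.
both = [c.name for c in population if grows(c, True, [KAN, TET])]
vector_only = [name for name in selected if name not in both]

print("grow with kanamycin:", selected)
print("grow with both antibiotics:", both)
print("carry only the cloning vector:", vector_only)
```

The first step removes every E. coli cell and every recipient that lacks the
cloning vector; the replica plate then distinguishes recipients that also
received the helper plasmid.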


SUMMARY 


Recombinant DNA technology comprises a battery of
experimental procedures that are used for inserting DNA
segments from one organism into a vector, often a bacterial
plasmid, and perpetuating the insert DNA-vector DNA
combination in a host cell. Large amounts of the insert (cloned)
DNA can be retrieved when required. The process would not
be possible without type II restriction endonucleases that
cleave DNA molecules reproducibly into fragments of discrete
sizes. These enzymes bind to specific sequences within a DNA
molecule and symmetrically cut phosphodiester bonds of
each strand at the recognition site. In addition, many other
enzymes, such as T4 DNA ligase and DNA polymerase, are
important for cloning genes.

A representative gene-cloning experiment has a number of
steps. (1) DNA is isolated from an organism that contains the
target gene and is cut with a restriction endonuclease. (2) A
DNA cloning vector is cut with the same restriction
endonuclease used to digest the source DNA. The cloning vector has
only one of these restriction endonuclease sites. (3) The two
DNA samples are mixed with T4 DNA ligase, and various
combinations of DNA molecules, including vector DNA and
DNA from the source organism, are generated by the
enzymatic formation of phosphodiester bonds at the ends of DNA
strands. (4) Host cells, usually E. coli, are transformed with
DNA molecules from the ligase reaction, which produces
some cells that carry vector-insert DNA constructs. Because
the vector has a DNA sequence (origin of replication) that
enables it to be replicated in the host cell, the entire construct
is perpetuated. Uptake of DNA by E. coli is facilitated by
CaCl₂-heat shock treatment, electroporation, or other means.
Conjugative and mobilization functions of plasmids are used
in some cases to transfer a plasmid-gene construct to a
bacterium that is not readily transformed.

Finally, selection schemes are available for identifying cells
with vector-insert DNA constructs. Transformed cells are
distinguished from nontransformed cells by testing for resistance
to specific antibiotics or by observing specifically colored
colonies. Cells with a specific cloned target gene are identified by
DNA hybridization with a homologous or heterologous
probe, by immunological determination of an encoded
recombinant protein, by the presence of a specified enzyme activity,
or by functional (genetic) complementation. The probability of
cloning a complete gene is increased by partially digesting the
source DNA with a restriction endonuclease and forming a
library of clones consisting of overlapping sequences of an
entire genome. Vectors based on bacteriophage λ,
bacteriophage P1, P1-derived plasmid (P1 artificial chromosome), and
the F plasmid (BAC) have been developed for carrying large
pieces of DNA and constructing genomic libraries from both
prokaryotic and eukaryotic organisms.

To obtain DNA segments that encode eukaryotic proteins, 
purified mRNA is used as a template for the enzyme reverse 
transcriptase to synthesize a cDNA strand; in turn, this strand 
acts as a template for DNA polymerase to produce a second 
cDNA strand. After enzymatic treatment, the double-stranded 
cDNA can be cloned into a vector. Multistep strategies have 
been devised to increase the likelihood that only full-length 
cDNA molecules are synthesized and cloned. 
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REVIEW QUESTIONS 


1. What are type II restriction endonucleases? Why are they 
important for recombinant DNA technology? 

2. When circular double-stranded DNA from the plasmid 
pCELl is digested by various restriction endonucleases and 
combinations thereof, the following bands (with lengths given 
in kilobase pairs) are observed: EcoRI, 6.0; BamHI, 6.0; Hindlll, 
6.0; Haell, 3.0, 2.0, and 1.0; EcoRI and Haell, 2.0 and 1.0; EcoRI 
and Hindlll, 3.5 and 2.5; EcoRI and BamHI, 4.5 and 1.5; 
BamHI and Hindlll, 5.0 and 1.0; BamHI and Haell, 3.0,1.5,1.0, 
and 0.5; and Hindlll and Haell, 3.0,1.5,1.0, and 0.5. Construct 
a restriction endonuclease enzyme site map with this informa¬ 
tion. 

3. Describe how plasmid pBR322 is used as a cloning vector. 
What are its special features? 

4. Describe the principal features of the pUC cloning system. 

5. A genomic library of a prokaryotic organism is often 
constructed by cloning the products of a Sau3AI partial digest of 
the genomic DNA into a BamHI site of the vector. 

a. Why are two different enzymes used in this experiment? 

b. What is a partial digestion, and how is it performed? 

c. Why is partial digestion often used for constructing 
genomic libraries? 


6. For gene-cloning experiments, why is the cleaved plasmid 
DNA often treated with alkaline phosphatase prior to the 
ligation step? 

7. Suggest a few different ways that recombinant plasmids 
can be introduced into a gram-negative bacterium, such as E. 
coli. 

8. Outline some of the strategies that are used to detect a 
cloned target gene within a library in E. coli. What conditions 
must be satisfied for each type of assay? 

9. What is a cDNA library? 

10. Why would you use a plasmid, bacteriophage λ, cosmid, 
or BAC as a cloning vector? 

11. What is a common mode of action of a type IIS restriction 
endonuclease? 

12. What is replica plating? Outline the basic methodology for 
replica plating. 

13. What is a fosmid? 

14. What is a tripartite mating? 

15. What is the role of RNase H during cDNA synthesis? 


Chemical Synthesis, Amplification, and Sequencing of DNA 


Technological advances in any area of science have profound 
effects on research. New protocols spawn novel experiments, and 
laboratory procedures that were at one time difficult to implement 
become much easier to perform. The very essence of molecular 
biotechnology is rooted in a wide range of technical developments, many of 
which have become commonplace and accessible to both large and small 
research facilities. For example, it is now standard to chemically synthesize 
a DNA molecule, amplify DNA using the polymerase chain reaction (PCR), and 
obtain the nucleotide sequence of DNA. Each of these procedures is derived 
from basic studies of the structure of DNA and the mechanism of DNA 
replication. Moreover, these experimental methods are essential for 
isolating, characterizing, and expressing cloned genes. 


Chemical Synthesis of DNA 

The ability to chemically synthesize a strand of DNA with a specific 
sequence of nucleotides easily, inexpensively, and rapidly has contributed 
significantly to the methodologies of molecular cloning and DNA 
characterization. Chemically synthesized single-stranded DNA oligonucleotides 
are used for assembling whole genes, amplifying specific DNA sequences, 
introducing mutations into cloned genes, screening gene libraries, 
sequencing DNA, and facilitating gene cloning. 

Machines that automate the chemical reactions for DNA synthesis 
(DNA synthesizers, or "gene machines") have made the production of 
single-stranded oligonucleotides (<50 nucleotides) into a more or less 
routine procedure. Generally, DNA synthesizers consist of a set of valves and 
pumps that are programmed to introduce, in the correct order, specified 
nucleotides and the reagents required for the coupling of each consecutive 
nucleotide to the growing chain. Chemical DNA synthesis does not follow 
the biological direction of DNA synthesis; rather, during the chemical 
process, each incoming nucleotide is coupled to the 5' hydroxyl terminus of 
the growing chain. All the reactions are carried out in succession in a single 
reaction column, and both the duration of each reaction and the washing 
steps are computer controlled. 
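
Because each incoming nucleotide is coupled to the 5' hydroxyl terminus of the growing chain, the order in which a synthesizer adds nucleotides is the reverse of the sequence as it is conventionally written (5' to 3'). The following minimal Python sketch illustrates this point only; the function name `coupling_order` is hypothetical and does not correspond to any instrument's software.

```python
def coupling_order(sequence_5_to_3: str) -> list[str]:
    """Return the nucleotides in the order they would be attached.

    The first element is the support-bound 3'-terminal nucleoside; each
    later element is coupled to the 5' hydroxyl of the growing chain.
    """
    seq = sequence_5_to_3.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be a non-empty string of A, C, G, T")
    # Chemical synthesis proceeds 3' -> 5', so reverse the written sequence.
    return list(reversed(seq))


order = coupling_order("ATGC")
print(order)  # ['C', 'G', 'T', 'A']
```

For the written sequence 5'-ATGC-3', the C is bound to the support first and the A is added in the last cycle.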

The Phosphoramidite Method 

Currently, the phosphoramidite method of chemical DNA synthesis is the 
procedure of choice. Before their introduction into the reaction column, the 
amino groups of the bases adenine, guanine, and cytosine are derivatized 
by the addition of benzoyl, isobutyryl, and benzoyl groups, respectively, to 
prevent undesirable side reactions during chain growth. Thymine is not 
treated because it lacks an amino group. Solid-phase synthesis, i.e., 
attachment of the growing DNA strand to a solid support, is used so that all 
the reactions can be conducted in one reaction vessel, the reagents from one 
reaction step can be readily washed away before the reagents for the next 
step are added, and the reagents can be used in excess in an attempt to 
drive the reactions to completion. 

FIGURE 4.1 Flowchart for the chemical synthesis of DNA oligonucleotides. After n 
coupling reactions (cycles), a single-stranded piece of DNA with n + 1 nucleotides 
is produced. 

FIGURE 4.2 Starting complex for the chemical synthesis of a DNA strand. 
The initial nucleoside has a protective DMT group attached to the 5' hydroxyl 
group of the deoxyribose moiety and a spacer molecule attached to the 
hydroxyl group of the 3' carbon of the 

The chemical synthesis of DNA is a multistep process (Fig. 4.1). The 
initial nucleoside (base and sugar only), which will be the 3'-terminal 
nucleotide of the synthesized strand, is attached to a spacer molecule by its 
3' hydroxyl terminus, and the spacer molecule is covalently attached to an 
inert support, which is often a controlled-pore glass (CPG) bead (a glass 
bead with uniformly sized pores) (Fig. 4.2). A dimethoxytrityl (DMT) 
group is attached to the 5' terminus of the first nucleoside to prevent the 5' 
hydroxyl group from reacting nonspecifically before the addition of the 
second nucleotide. Each nucleotide that is added to the growing chain has 
a 5' DMT protective group and also a diisopropylamine group attached to 
a 3' phosphite group that is protected by a β-cyanoethyl (CH2CH2CN) 
group (Fig. 4.3). This molecular assembly is called a phosphoramidite. 

After the first nucleoside is bound to the CPG beads, the cycle begins. 
First, the reaction column is washed extensively with an anhydrous 
reagent, e.g., acetonitrile, to remove water and any nucleophiles that may 
be present. The column is flushed with argon to purge the acetonitrile. 
Next, the 5' DMT group is removed from the attached nucleoside by 
treatment with trichloroacetic acid (TCA) to yield a reactive 5' hydroxyl group 
(Fig. 4.4). After this detritylation step, the reaction column is washed with 
acetonitrile to remove the TCA and then with argon to remove the 
acetonitrile. The machine is programmed to introduce the next prescribed base 
(phosphoramidite) and tetrazole simultaneously for the activation and 
coupling steps. The tetrazole activates the phosphoramidite so that its 3' 
phosphite forms a covalent bond with the 5' hydroxyl group of the initial 
nucleoside (Fig. 4.5). Unincorporated phosphoramidite and tetrazole are 
removed by flushing the column with argon. 

Because not all of the support-bound nucleosides are linked to a 
phosphoramidite during the first coupling reaction, the unlinked residues must 
be prevented from linking to the next nucleotide during the following 
cycle. To do this, acetic anhydride and dimethylaminopyridine are added 
to acetylate the unreacted 5' hydroxyl groups (Fig. 4.6). If this capping step 
is not carried out, then, after a number of cycles, the growing chains will 
differ in both length and nucleotide sequence. 

FIGURE 4.3 Structure of a phosphoramidite. Phosphoramidites are available for each 
of the four bases (A, C, G, and T) that are used for the chemical synthesis of a DNA 
strand. A diisopropylamine group is attached to the 3' phosphite group of the 
nucleoside. A β-cyanoethyl group protects the 3' phosphite group, and a DMT 
group is bound to the 5' hydroxyl group of the deoxyribose sugar. 

FIGURE 4.5 Activation and coupling. The activation of a phosphoramidite enables 
its 3' phosphite group to attach to the 5' hydroxyl group of the bound detritylated 
nucleoside. 

FIGURE 4.6 Capping. The available 5' hydroxyl groups of unreacted detritylated 
nucleosides are acetylated to prevent them from participating in the coupling 
reaction of the next cycle. 

At this stage of the process, the linkage between the nucleotides is in 
the form of a phosphite triester bond, which is unstable and prone to 
breakage in the presence of either acid or base. Therefore, the phosphite 
triester is oxidized with an iodine mixture to form the more stable 
pentavalent phosphate triester (Fig. 4.7). After this oxidation step and a 
subsequent wash of the reaction column, the cycle of detritylation, 
phosphoramidite activation, coupling, capping, and oxidation is repeated 
(Fig. 4.1). This cycling continues with each successive phosphoramidite until 
the last programmed residue has been added to the growing chain. When the 
final cycle is completed, the newly synthesized DNA strands are bound to the 
CPG beads; each phosphate triester contains a β-cyanoethyl group; every 
guanine, cytosine, and adenine carries its amino-protecting group; and the 
5' terminus of the last nucleotide has a DMT group. 

FIGURE 4.7 Oxidation. The phosphite triester internucleotide linkage is oxidized to 
the pentavalent phosphate triester. This reaction stabilizes the phosphodiester bond 
and makes it less susceptible to cleavage under either acidic or basic conditions. 

The β-cyanoethyl groups are removed by a chemical treatment in the 
reaction column. The DNA strands are then cleaved from the spacer 
molecule, leaving a 3' hydroxyl terminus. The DNA is eluted from the reaction 
column, and in succession, the benzoyl and isobutyryl groups are stripped 
away and the DNA is detritylated. The 5' terminus of the DNA strand is 
phosphorylated either by a T4 polynucleotide kinase reaction or by a 
chemical procedure. Phosphorylation can also be carried out after 
detritylation while the oligonucleotide is still bound to the support. 

To achieve a reasonable overall yield of an oligonucleotide, the coupling 
efficiency should be greater than 98% at each step. The coupling efficiency 
of each cycle is determined by spectrophotometrically monitoring released 
trityl groups. If, for example, the efficiency is 99% at each cycle during the 
production of a 20-unit oligonucleotide (20-mer), which entails 19 coupling 
reactions, since the first base is bound to the spacer and is not involved in a 
coupling step, then 83% (i.e., 0.99^19 × 100) of the product will be 20 
nucleotides long. If a 60-mer is synthesized with 99% efficiency at each cycle, 
then about 55% of the final product will contain all 60 of the nucleotides. 
With an average coupling efficiency consistently less than 98%, the yield of 
full-length oligonucleotides diminishes as a function of the required number 
of cycles (Table 4.1). The coupling efficiency for most commercial DNA 
synthesizers averages about 99.5% for each step. However, depending on the 
length and stringency of the end use of an oligonucleotide, it may be 
necessary to purify the final product using either reverse-phase high-pressure 
liquid chromatography or gel electrophoresis. These methods separate the 
longer target oligonucleotides from the shorter "failure" sequences. 
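
The yield arithmetic described above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the same coupling efficiency at every cycle and counts n - 1 coupling reactions for an n-mer, as stated in the text; the function name `full_length_yield` is illustrative only.

```python
def full_length_yield(length: int, coupling_efficiency: float) -> float:
    """Fraction of product that is full length after chemical synthesis.

    An n-mer requires n - 1 coupling reactions because the first
    nucleoside is already bound to the support.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length - 1)


# Print percent yields for the efficiencies and lengths used in Table 4.1.
lengths = (20, 40, 60, 80, 100)
for efficiency in (0.90, 0.95, 0.98, 0.99, 0.995):
    row = [100 * full_length_yield(n, efficiency) for n in lengths]
    print(f"{100 * efficiency:5.1f}%", "  ".join(f"{y:8.3f}" for y in row))
```

For a 20-mer at 99% efficiency the function returns about 0.83, and for a 60-mer at the same efficiency about 0.55, in agreement with the worked values in the paragraph.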

TABLE 4.1 Overall yields of chemically synthesized oligonucleotides with 
different coupling efficiencies 

Coupling          Overall yield of oligonucleotide (%) 
efficiency (%)    20-mer    40-mer    60-mer    80-mer    100-mer 
90                14        1.6       0.2       0.02      0.003 
95                38        14        4.8       1.7       0.6 
98                68        46        30        20        14 
99                83        68        55        45        37 
99.5              91        82        74        67        61 


Uses of Synthesized Oligonucleotides 

Chemically synthesized oligonucleotides (<100-mers) have a myriad of 
functions. Single-stranded hybridization oligonucleotide probes (20- to 
40-mers) can be formulated by deducing the codons from the amino acid 
sequence of a protein and then used to screen a genomic library for the 
gene (Fig. 4.8A). Since the actual codons representing a conserved amino 
acid sequence are unknown because of codon redundancy, especially at the 
third position, a single arbitrary synthetic probe may not contain sufficient 
complementary bases (matches) to produce significant hybridization with 
a heterologous sequence. For this reason, a set of mixed probes is often 
used to screen a genomic library. The formation of a sample of mixed 
probes is straightforward. Briefly, during chemical DNA synthesis, instead 
of providing a specific phosphoramidite for a particular nucleotide site, a 
mixture of different bases is added to the reaction. For example, with the 
addition of equal concentrations of four different bases for one nucleotide 
position, four different probes are produced. If two sites are treated this 
way, 16 (4^2) different probes will be synthesized (Fig. 4.8B); for n sites, 
there are 4^n different oligonucleotides. Moreover, the frequencies of various 
probes in the mixture can be skewed by varying the proportions of bases in 
the reaction mixture for specific sites. As a consequence, in contrast to a 
single probe, a set of mixed probes is likely to contain sequences that are 
highly complementary to a heterologous gene. As discussed below, 
single-stranded oligonucleotides (~17- to 24-mers, or primers) are also required 
for DNA sequencing and PCR. In some cases, additional sequences are 
added to the PCR primers as part of the synthesis process to create 
molecules with restriction endonuclease sites for cloning or sequences that 
contain regulatory elements for transcription and translation of the amplified 
DNA after it is inserted into a vector. 
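
The 4^n growth of a mixed-probe set can be made concrete by enumerating every sequence that an equimolar mixture of bases would produce. The Python sketch below is illustrative only: it borrows the question-mark notation of Fig. 4.8B for a mixed site, and the template string and function name `expand_mixed_probe` are hypothetical.

```python
from itertools import product


def expand_mixed_probe(template: str, wildcard: str = "?") -> list[str]:
    """List every oligonucleotide encoded by a template with mixed sites.

    Each wildcard position stands for an equal mixture of A, C, G, and T;
    every other position is a fixed base.
    """
    choices = ["ACGT" if base == wildcard else base for base in template]
    return ["".join(combo) for combo in product(*choices)]


probes = expand_mixed_probe("CC?CC?CC")  # two mixed sites
print(len(probes))  # 16
print(probes[:4])   # ['CCACCACC', 'CCACCCCC', 'CCACCGCC', 'CCACCTCC']
```

One mixed site gives 4 sequences and two give 16; the list grows by a factor of 4 for each additional mixed site, so enumeration is practical only for a small number of sites.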

FIGURE 4.8 (A) All possible DNA sequences deduced from a protein sequence. For 
simplicity, only 3 amino acids, 2 of which have twofold degeneracy and 1 that has 
a single codon, are shown. (B) Formation of sets of mixed probes. A question mark 
denotes a site where equal concentrations of A, C, G, and T are supplied during 
chemical synthesis. The remaining sites have the same deoxynucleotide (C). Four 
and 16 different oligonucleotides are produced when there is a 25% probability that 
any 1 of the 4 nucleotides will be incorporated at one or two sites, respectively. 

Oligonucleotides are extremely useful for both cloning DNA molecules 
and creating unique cloning sites. For example, linkers are short, 
double-stranded, blunt-ended DNA molecules with self-complementary 
(palindromic) strands that contain a restriction endonuclease site (Fig. 4.9A and 
B). Typically, after complementary single-stranded oligonucleotides are 
annealed, the resulting double-stranded linker is blunt-end ligated to DNA 
molecules that are unclonable because they lack a suitable restriction 
endonuclease site. In practice, during the ligation reaction, the linker molecules 
are successively added to the ends of the source DNA and to each other. 
Then, after treatment with the restriction endonuclease corresponding to 
the linker sequence, the large DNA fragments, i.e., the source DNA with 
restriction endonuclease extensions, are separated from the remnants of 
linker DNA and cloned into the comparable site of a cloning vector (Fig. 
4.10). 
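
A linker strand is self-complementary when it reads the same as its own reverse complement, which is why a single synthetic oligonucleotide can anneal to itself to form the double-stranded linker. The following Python sketch checks this property. The two example sequences are illustrative 8-mers built around the EcoRI (GAATTC) and BamHI (GGATCC) recognition sites and are not taken from Fig. 4.9 or Fig. 4.10.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True if the strand can base pair with a second copy of itself."""
    return seq.upper() == reverse_complement(seq)


for linker in ("GGAATTCC", "CGGATCCG", "GGAATTCA"):
    print(linker, is_self_complementary(linker))
# GGAATTCC True
# CGGATCCG True
# GGAATTCA False
```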

Adaptors are variants of linkers that are often used to create novel 
cloning sites in vectors. They are short double-stranded DNA molecules 
that contain one or more restriction endonuclease sites and may have either 
one blunt end and one extended end or two extended ends. For example, 
the BamHI-SmaI adaptor (Fig. 4.9C) is inserted into a unique BamHI site 
of a cloning vector to create a novel SmaI site. The DNA fragment that is to 
be cloned can be blunt-end ligated into the SmaI site, and after selection, 
the insert can be retrieved by treating the construct with BamHI (Fig. 4.11). 
For this procedure to be effective, the cloned DNA fragment must not have 
the same restriction endonuclease sites as the linker sequence; otherwise, 
the insert will be cut during restriction endonuclease treatment. 

Oligonucleotides are the key components for assembling genes. There 
are a number of applications for synthetic genes, including large-scale 
production of proteins, testing protein function after changing specific codons, 
and creating nucleotide sequences that encode proteins with novel 
properties. The production of short genes (60 to 80 base pairs [bp]) is technically 
straightforward and can be accomplished by synthesizing the 
complementary strands and then annealing them. For the production of longer genes 
(>300 bp), however, special strategies must be devised because the 
coupling efficiency of each cycle during chemical DNA synthesis is never 
100%. For example, if a gene contains 1,000 bp and the average coupling 
efficiency is 99.5%, then the proportion of full-length single DNA strands 
after the last cycle is only about 0.7% (0.995^999 ≈ 0.007). To overcome this 
problem, synthetic (double-stranded) genes are assembled in modular fashion 
from oligonucleotides that are about 60 nucleotides in length. 

One method for building a synthetic gene requires the initial 
production of a set of overlapping, complementary oligonucleotides, each of 
which is usually between 20 and 60 nucleotides long. Each internal section 
of the gene is made up of a set of oligonucleotides with complementary 
3'- and 5'-terminal extensions that are designed to base pair precisely with 
a different oligonucleotide that has complementary terminal extensions 
(Fig. 4.12). The oligonucleotides that make up the two ends of a gene are 
aligned to give blunt ends. Thus, after the gene is assembled, the only 
remaining requirement to complete the process is sealing the nicks along 
the backbones of the two strands with T4 DNA ligase. In addition to the 
protein coding sequence, synthetic genes can be designed with restriction 
endonuclease sites at their ends to facilitate insertion into a cloning vector 
and, if necessary, with additional sequences that contain signals for the 
proper initiation and termination of transcription and translation. To 
optimize translation, the codons of a gene from one organism can be changed 
to those that are preferred by the host cell without altering the amino acid 
sequence. 

FIGURE 4.9 Typical linker and adaptor sequences. (A) A 6-mer EcoRI linker; 
(B) an 8-mer EcoRI linker; (C) a BamHI-SmaI adaptor. Both the BamHI 
cohesive ends and the SmaI recognition sequence are indicated. 

FIGURE 4.10 Cloning with a linker. A palindromic oligonucleotide (red) with a 
BamHI recognition site base pairs to itself to form a BamHI linker. During the 
ligation reaction, linkers are successively attached to the source DNA and to each 
other. The ligation mixture is treated with BamHI. The large DNA molecules are 
purified and cloned into the BamHI site of a cloning vector. DNA molecules without 
linkers are not cloned. 

FIGURE 4.11 Creating a restriction endonuclease site in a vector with an adaptor. 
After self-hybridization, an adaptor molecule (red) with two BamHI 5' extensions 
and a SmaI site (BamHI-SmaI adaptor) is formed. The BamHI-SmaI adaptor is 
inserted into a unique BamHI site of a vector to create a unique SmaI site. DNA 
(blue) is cloned into the SmaI site by blunt-end ligation. Although the SmaI site is 
destroyed by the insertion of a DNA molecule, the insert can be retrieved by cutting 
the vector with BamHI. 

FIGURE 4.12 Assembly of a synthetic gene from short oligonucleotides. Individual 
oligonucleotides (20- to 60-mers) are synthesized chemically. Their sequences are 
designed so that they form a double-stranded array after annealing. T4 DNA ligase 
is used to seal the nicks and produce an intact version of the gene. 

Another way to prepare a full-size gene is to synthesize a specified set 
of overlapping oligonucleotides that are about 60 nucleotides in length 
with approximately 20-base overlaps (Fig. 4.13). After the 3' and 5' 
extensions are annealed, large gaps still remain, but the base-paired regions are 
both long enough and stable enough to hold the structure together. After 
all the oligonucleotides are combined, the gaps are filled by enzymatic 
DNA synthesis with Escherichia coli DNA polymerase I. This enzyme uses 
the 3' hydroxyl groups as replication initiation points and the 
single-stranded regions as templates. After the enzymatic synthesis is completed, 
the nicks are sealed with T4 DNA ligase. For larger genes (>1,000 bp), 
smaller sections of the gene are first assembled into units (chunks, or 
synthons) about 500 bases in length that are then combined with other 
500-base units, and in turn, these larger kilobase segments are joined together 
until the entire sequence is completed. Computer programs (e.g., Gene 
Design) that make it easier to determine the best set of oligonucleotides and 
overlaps for gene construction, as well as allow the user to select a 
particular codon usage, change any codon, and designate restriction 
endonuclease sites at specific locations, are available both commercially and freely 
on the Internet. Finally, it is absolutely essential that a chemically 
synthesized gene have the correct sequence of nucleotides. Consequently, small 
synthetic genes are sequenced directly, and for larger genes, the sequence 
of each of the 500-base building blocks is determined. 
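
The layout of overlapping oligonucleotides for the gap-filling approach can be sketched in Python. This is a simplified illustration, not the algorithm of any particular gene design program: it assumes fixed 60-nucleotide oligonucleotides with 20-base overlaps taken alternately from the top and bottom strands, and it ignores practical concerns such as melting temperature and secondary structure. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_overlapping_oligos(gene: str, oligo_len: int = 60, overlap: int = 20):
    """Tile a gene with oligonucleotides that alternate between strands.

    Returns (start, strand, sequence) tuples. `start` is the position on
    the top strand; bottom-strand oligonucleotides are written 5' to 3'.
    """
    if not gene:
        raise ValueError("gene must not be empty")
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    gene = gene.upper()
    step = oligo_len - overlap
    oligos, start, index = [], 0, 0
    while True:
        segment = gene[start:start + oligo_len]
        strand = "top" if index % 2 == 0 else "bottom"
        sequence = segment if strand == "top" else reverse_complement(segment)
        oligos.append((start, strand, sequence))
        if start + oligo_len >= len(gene):
            break  # this oligonucleotide reaches the end of the gene
        start += step
        index += 1
    return oligos


gene = "ACGT" * 35  # a 140-nucleotide placeholder sequence
for start, strand, sequence in design_overlapping_oligos(gene):
    print(start, strand, len(sequence))
```

With these defaults, neighboring oligonucleotides share 20 bases and come from opposite strands, so they can anneal through the shared region and leave single-stranded gaps for the polymerase to fill.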


FIGURE 4.13 Assembly and in vitro enzymatic DNA synthesis of a gene. Individual 
oligonucleotides are synthesized chemically. Their sequences are designed to 
enable them to form a stable molecule, with base-paired regions separated by open 
regions (gaps). The gaps are filled in by in vitro enzymatic DNA synthesis. The 
nicks are sealed with T4 DNA ligase. 


Polymerase Chain Reaction 

PCR is an effective procedure for generating large quantities of a specific 
DNA sequence in vitro. This amplification, which can be more than a 
millionfold, is achieved by a three-step cycling process. The essential 
components for PCR amplification are (1) two synthetic oligonucleotide primers 
(~20 nucleotides each) that are complementary to regions on opposite 
strands that flank the target DNA sequence and that, after annealing to the 
source DNA, have their 3' hydroxyl ends oriented toward each other; (2) a 
template sequence in a DNA sample that lies between the primer-binding 
sites and that can be from 100 to ~35,000 bp in length; (3) a thermostable 
DNA polymerase that can withstand being heated to 95°C or higher and 
that copies the DNA template with high fidelity; and (4) the four 
deoxyribonucleotides. 
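
The requirement that the two primers bind opposite strands with their 3' ends pointing toward each other can be illustrated with a small "in silico PCR" sketch in Python. It assumes exact primer matches and a single binding site for each primer, which real primer design tools do not assume; the sequences and the function name `find_amplicon` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the top-strand segment bounded by two primers, or None.

    `template` is the top strand, 5' to 3'. The forward primer matches the
    top strand; the reverse primer matches the bottom strand, so its
    reverse complement is what appears in the top strand.
    """
    template = template.upper()
    forward = forward_primer.upper()
    reverse_site = reverse_complement(reverse_primer)
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer site must lie downstream of the forward primer.
    end_site = template.find(reverse_site, start + len(forward))
    if end_site == -1:
        return None
    return template[start:end_site + len(reverse_site)]


template = "TTTTATGCCGTAGGGGGGGGCATTGACCAAAA"
print(find_amplicon(template, "ATGCCGTA", "GGTCAATG"))
# ATGCCGTAGGGGGGGGCATTGACC
```

The product begins with the forward primer sequence and ends with the complement of the reverse primer, which matches the description of a "short template" later in the chapter.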

A typical PCR process entails a number of cycles for amplifying a 
specific DNA sequence. Each cycle has three successive steps. 

1. Denaturation. The first step in the PCR amplification system is the 
thermal denaturation of the DNA sample by raising the temperature 
within a reaction tube to 95°C. In addition to the source 
template DNA, this reaction tube contains a vast molar excess of the 
two oligonucleotide primers, a thermostable DNA polymerase 
(e.g., Taq DNA polymerase, isolated from the bacterium Thermus 
aquaticus), and four deoxyribonucleotides. The temperature is 
maintained for about 1 minute. 

2. Renaturation. For the second step, the temperature of the mixture 
is slowly lowered to ~55°C. During this step, the primers base pair 
with their complementary sequences in the DNA sample. 

3. Synthesis. In the third step, the temperature is raised to ~75°C, 
which is optimum for the catalytic functioning of Taq DNA 
polymerase. DNA synthesis is initiated at the 3' hydroxyl end of each 
primer and uses the source DNA as a template (Fig. 4.14). 

All steps in a PCR cycle are carried out in an automated block heater 
that is programmed to change temperatures after a specified period of 
time. One cycle generally lasts from 3 to 5 minutes. 
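
The three-step cycle can be written down as a simple program of temperature holds, much as a block heater would be configured. In the Python sketch below, the temperatures and the 1-minute denaturation come from the text; the hold times for renaturation and synthesis are assumptions chosen only so that one cycle falls within the stated 3 to 5 minutes, and the time needed to change temperature is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temperature_c: float
    hold_seconds: int


# Temperatures follow the text; the last two hold times are assumed values.
CYCLE = (
    Step("denaturation", 95, 60),
    Step("renaturation", 55, 60),   # assumed hold time
    Step("synthesis", 75, 120),     # assumed hold time
)


def run_time_minutes(cycle, n_cycles: int) -> float:
    """Total hold time in minutes, ignoring temperature ramping."""
    return n_cycles * sum(step.hold_seconds for step in cycle) / 60


print(run_time_minutes(CYCLE, 30))  # 120.0
```

Under these assumed hold times, 30 cycles take about 2 hours of hold time.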

To understand how the PCR protocol succeeds in amplifying a discrete 
segment of DNA, it is important to keep in mind the location of each 
primer-annealing site and its complementary sequence within the strands 
that are synthesized during each cycle. During the synthesis phase of the 
first cycle, the newly synthesized DNA from each primer is extended 
beyond the endpoint of the sequence that is complementary to the second 
primer. These new strands form "long templates" that are used in the 
second cycle (Fig. 4.14). 

FIGURE 4.14 First PCR cycle. The target DNA lies between sequences 1' and 2 on one 
strand and between sequences 1 and 2' on the complementary strand. Two primers 
(P1 and P2) are mixed with the sample DNA. The mixture, which also contains Taq 
DNA polymerase and the four deoxyribonucleotides (deoxyribonucleoside 
triphosphates [dNTPs]), is heated to 95°C to denature the DNA and then slowly cooled 
to 55°C. The primers, which are present in excess, base pair to the original DNA 
strands of the sample during the renaturation (annealing) step. The temperature is 
raised to about 75°C, and DNA synthesis commences from the 3' hydroxyl end of 
each primer sequence and continues past the region of the template DNA strand 
that is complementary to the other primer sequence. The products of this reaction 
are two long strands of DNA (long templates) that serve as additional DNA 
templates during the second PCR cycle. 

DNA Sequencing with Chain-Terminating Inhibitors 

F. Sanger, S. Nicklen, and A. R. Coulson 
Proc. Natl. Acad. Sci. USA 74:5463-5467, 1977 

New techniques are the lifeblood of science. They enable researchers to 
acquire information that was previously inaccessible, and that, in turn, 
generates insights that stimulate new research and lead to new discoveries. 
For molecular biotechnology, DNA sequencing and PCR are powerful 
procedures that have become laboratory mainstays. 

DNA sequencing by enzymatic synthesis with chain elongation inhibitors 
is a relatively simple, accurate, and reliable method. The most definitive 
form of molecular characterization of a cloned piece of DNA is its 
sequence. Among other things, the coding content of an insert, potential 
primer sequences for a PCR, and the presence of a gene mutation can be 
determined by DNA sequencing. At the time the Sanger (dideoxy) method 
was published, most DNA sequencing was carried out by the base-specific 
chemical-cleavage method devised by A. M. Maxam and W. Gilbert (Proc. 
Natl. Acad. Sci. USA 74:560-564, 1977). Before the development of these 
techniques, nucleic acid sequencing was more or less limited to RNA 
molecules. The sequencing of a DNA molecule required transcribing a DNA 
fragment into RNA with RNA polymerase and then sequencing the RNA 
product. In general, RNA sequencing entailed treating a radiolabeled RNA 
molecule with different ribonucleases, chromatographically separating the 
digestion products, redigesting the separated products, hydrolyzing the 
products of the second digestion with alkali, chromatographically separating 
the hydrolysis products, determining the sequence of the oligonucleotides, 
and constructing the sequence based on overlapping stretches of 
nucleotides. This approach was time-consuming and tedious. With the advent 
of the dideoxy method, it became obsolete. Now, RNA molecules are 
usually sequenced by converting them into DNA molecules. The Sanger 
method superseded the Maxam and Gilbert sequencing procedure when 
the M13 bacteriophage cloning system, which provided the single-stranded 
DNA templates required for sequencing, was developed. Sanger 
and Gilbert received the Nobel Prize in Chemistry in 1980 for this work. 

The ability to sequence DNA molecules has been, directly and indirectly, 
responsible for both the dramatic upsurge in studies of the molecular 
basis of human diseases and the development of diagnostic and 
therapeutic treatments for these disorders. 

During the second cycle, the original DNA strands and the new strands 
synthesized in the first cycle (long templates) are denatured and then 
hybridized with the primers. The large molar excess of primers in the 
reaction mixture ensures that they will hybridize to the template DNA before 
complementary template strands have the chance to reanneal to each other. 
A second round of synthesis produces long templates from the original 
strands, as well as some DNA strands that have a primer sequence at one 
end and terminate with a sequence complementary to the other primer at 
the other end ("short templates") that were generated from the long 
templates (Fig. 4.15). 

FIGURE 4.15 Second PCR cycle. The templates for this cycle are the long templates 
synthesized during the first PCR cycle and the original DNA strands. The primers 
hybridize to complementary regions in both the original strands and the long 
template strands, and DNA synthesis produces more long template strands from the 
original strands and short template strands from the long template strands. A short 
template has a primer sequence at one end and the sequence complementary to the 
other primer at its other end. 

During the third cycle, short templates, long templates, and original 
strands all hybridize with the primers and are replicated (Fig. 4.16). In 
subsequent cycles, the short templates preferentially accumulate, and by the 
30th cycle, these strands are about a million times more abundant than 
either the original or long template strands (Fig. 4.17). PCR has become a 
pervasive technique that is used for innumerable purposes, some of which 
are described here and many others in the ensuing chapters. 
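
The preferential accumulation of short templates follows from simple bookkeeping: in each cycle, every original strand yields one new long template, and every long or short template yields one new short template. The Python sketch below tracks these counts under the idealized assumption that every strand is copied in every cycle, which real reactions only approximate; the function name `simulate_pcr` is illustrative.

```python
def simulate_pcr(cycles: int, starting_duplexes: int = 1):
    """Count original, long-template, and short-template strands per cycle.

    Returns a list of (cycle, original, long, short) tuples, assuming that
    every strand present is copied exactly once in every cycle.
    """
    original = 2 * starting_duplexes  # two strands per starting duplex
    long_templates = 0
    short_templates = 0
    history = []
    for cycle in range(1, cycles + 1):
        new_long = original                           # copied from originals
        new_short = long_templates + short_templates  # copied from templates
        long_templates += new_long
        short_templates += new_short
        history.append((cycle, original, long_templates, short_templates))
    return history


for cycle, original, long_t, short_t in simulate_pcr(30):
    if cycle in (1, 2, 3, 10, 20, 30):
        print(cycle, original, long_t, short_t)
```

In this model the number of long templates grows linearly (two per cycle from one starting duplex), whereas the number of short templates roughly doubles each cycle once they appear, so after a few dozen cycles the short templates make up nearly all of the DNA.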

PCR Amplification of Full-Length cDNAs 

A PCR-based method that enriches for full-length cDNA molecules entails 
adding an oligonucleotide that consists of an oligo(dT) sequence followed 
by a PCR primer sequence at the 5' end to a purified poly(A) mRNA 
preparation (Fig. 4.18). The first cDNA strand is synthesized with the enzyme 
reverse transcriptase, which catalyzes the synthesis of a DNA strand using 
RNA as a template. When reverse transcriptase reaches the 5' end of an 
RNA template, its terminal transferase activity, which does not require a 
template, adds additional nucleotides that consist predominantly of 
cytosines. A second oligonucleotide in the reaction mixture that has a poly(dG) 
sequence at its 3' end and a PCR primer sequence at the 5' end base pairs 
with the poly(dC) tract at the end of each full-length cDNA first strand. 
Reverse transcriptase uses the sequence of the second oligonucleotide, 
including the primer sequence, as a template to extend the cDNA first 
strand at the 3' end. The reaction conditions prevent tandem repeats from 
forming at the 5' ends of the full-length first cDNA strands. Next, reverse 
and forward PCR primers are added to the reaction mixture, and full- 
length double-stranded cDNA molecules are generated by PCR. Incomplete 
cDNA molecules do not have oligo(dC) tracts at their 5' ends. Consequently, 
they lack the necessary complementary sequence for the forward primer, 
and as a result, they are not amplified. Moreover, sequences for restriction 
endonuclease sites that facilitate cloning into a vector may be included as 
part of the original oligo(dT)-primer and primer-oligo(dG)
oligonucleotide sequences. This PCR amplification strategy has been dubbed SMART
(which stands for switching mechanism at 5' end of RNA transcript) cDNA 
synthesis by its developers. 

Gene Synthesis by PCR 

The assembly of a gene by PCR is faster and more economical than filling 
in overlapping oligonucleotides using DNA polymerase and then sealing 
the nicks with T4 DNA ligase. One PCR-based protocol for total gene 
construction starts with two overlapping oligonucleotides (A and B) that 
represent sequences from the center of the gene (Fig. 4.19). After being 
annealed, these oligonucleotides have recessed 3' hydroxyl groups that 
provide a starting point for DNA synthesis during the elongation phase 
of a PCR cycle. A double-stranded DNA molecule is formed by a filling-in
reaction.


FIGURE 4.16 Third PCR cycle. During the renaturation
step, the primer sequences hybridize to complementary
regions of original, long-template, and short-template
strands, and DNA synthesis produces long templates
from the original strands and short templates from both
the long and short templates.


This 4-minute cycle is repeated 20 times to maximize the amount
of the product that is formed. Next, two additional oligonucleotides (C 
and D) are added to the mixture. Oligonucleotide C overlaps at its 3' end 
with the 5' end of oligonucleotide A and represents the nucleotide 
sequence of the gene immediately upstream of the oligonucleotide A 
sequence. Oligonucleotide D overlaps at its 3' end with the 5' end of
oligonucleotide B and represents the nucleotide sequence of the gene
immediately downstream of the oligonucleotide B sequence. After 20
denaturation, renaturation, and synthesis cycles, a double-stranded DNA 
with a specific sequence order (CABD) is produced. 

Thereafter, pairs of oligonucleotides are added, one of the pair
overlapping the upstream sequence of the DNA molecule formed in the previous
round and the other overlapping the downstream sequence, and subjected 
to 20 PCR cycles for each pair added until the entire gene is formed.


FIGURE 4.17 Thirtieth PCR cycle. By the 30th cycle, the population of DNA
molecules in a reaction tube consists almost entirely of short template
strands.

FIGURE 4.18 PCR amplification of full-length cDNAs. An oligonucleotide with
oligo(dT) and an added sequence [oligo(dT)-primer] is used by reverse
transcriptase to initiate first-strand cDNA synthesis from poly(A) mRNA
templates (1). The terminal transferase activity of reverse transcriptase adds
mostly deoxycytidines (dCs) to the end of each full-length first-strand cDNA
molecule (1). A primer-deoxyguanosine (dG) oligonucleotide that base pairs
with the dC tail (2) acts as a template for reverse transcriptase to extend the
first-strand cDNA at the 3' ends (3). Forward and reverse primers that have the
same sequences as the primer-dG and oligo(dT)-primer oligonucleotides,
respectively, are added to the first-strand cDNA mixture (4), and full-length
double-stranded cDNAs are generated by PCR amplification (5).

FIGURE 4.19 Gene synthesis by PCR. Overlapping oligonucleotides (A and B) are filled
in from the recessed 3' hydroxyl ends during DNA synthesis. Oligonucleotides (C
and D) that are complementary to the ends of the product of the first PCR cycle are
added to a sample, overlapping molecules are formed after denaturation and
renaturation, and the recessed ends are filled in during DNA synthesis. Next,
oligonucleotides (E and F) that overlap the ends of the second-cycle PCR product
are added to a sample, and a third PCR cycle is initiated. The final PCR product
is a double-stranded DNA molecule with a specified sequence of nucleotides. The
pairs of letters with or without a prime (e.g., A' and A) represent
complementary oligonucleotides. Each oligonucleotide corresponds to a sequence
from a particular DNA strand.


Generally, the oligonucleotides are about 50 nucleotides long. Thus, the 10
rounds of 20 4-minute PCR cycles that are required to synthesize a gene
with 1,000 bp can be carried out in 1 day. In addition, as with other methods
for assembling genes, the last pair of oligonucleotides (i.e., the 5' and 3'
ends of the gene) can be made with supplementary sequences outside the
coding region that facilitate the cloning of the gene into a vector and, at the
5' end, with sequences that enable the gene to be expressed in a host cell.
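
The 1-day estimate follows directly from the numbers given above. The short calculation below is only a worked arithmetic check; the function name and its default arguments are illustrative, and the time needed to add each new pair of oligonucleotides between rounds is ignored.

```python
def assembly_time_hours(rounds=10, cycles_per_round=20, minutes_per_cycle=4):
    """Thermocycling time, in hours, for assembling a gene in rounds."""
    total_minutes = rounds * cycles_per_round * minutes_per_cycle
    return total_minutes / 60


# 10 rounds x 20 cycles x 4 minutes = 800 minutes of cycling
print(round(assembly_time_hours(), 1), "hours")
```

The cycling itself therefore takes a little over 13 hours, which is consistent with completing the synthesis within a day.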


DNA-Sequencing Techniques 

In molecular terms, the definitive understanding of a DNA molecule comes 
from determining its nucleotide sequence. The function of a gene can often 
be deduced from its nucleotide sequence. For example, a presumptive 
amino acid sequence, determined from the nucleotide sequence, can be 
compared with protein sequences of known genes, and significant sequence 
similarity generally indicates a protein with an equivalent function. Also,
distinctive coding regions, such as DNA-binding sites, receptor recognition 
sites, and transmembrane domains, can be ascertained. The nucleotide 
sequences in noncoding regions (regions that do not encode a protein or 
RNA molecule) may provide information about the regulation of a gene. In 
addition, nucleotide sequence information is essential for
molecular-cloning studies and for characterizing gene activity.
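
As a minimal sketch of how a presumptive amino acid sequence is derived from a nucleotide sequence, the following example translates a coding strand with the standard genetic code and computes a naive percent identity between two protein sequences. The names translate and identity are illustrative, the identity measure assumes that the sequences are already aligned without gaps, and real comparisons rely on dedicated alignment programs rather than on anything this simple.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding-strand DNA sequence until the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def identity(seq_a, seq_b):
    """Fraction of identical residues over the shorter (ungapped) length."""
    n = min(len(seq_a), len(seq_b))
    if n == 0:
        return 0.0
    return sum(a == b for a, b in zip(seq_a, seq_b)) / n


print(translate("ATGAAATGGTGA"))
print(identity("MKW", "MKF"))
```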

For more than 3 decades, the dideoxynucleotide procedure developed 
by Fred Sanger has been used for DNA sequencing. This includes 
sequencing of DNA fragments containing one to a few genes and also 
many whole genomes, including the human genome. However, the interest 
in sequencing large numbers of DNA molecules in less time and at a lower 
cost has driven the recent development of new sequencing technologies 
that can process thousands to millions of sequences concurrently (a term 
often used to describe this is massive parallelization). Many different 
sequencing technologies are currently being explored; however, those that 
have reached the commercialization stage are largely based on the
principles of sequencing by synthesis, which includes pyrosequencing and
sequencing using reversible chain terminators, and on sequencing by
ligation. In general, these new, second-generation methods involve repeated
cycles of (1) enzymatic addition of nucleotides to a primer based on
complementarity to a template DNA fragment and (2) detection and
identification of the nucleotide(s) added. The techniques differ in the method by
which the nucleotides are extended, employing either DNA polymerase to
catalyze the addition of a single nucleotide or ligase to add a short,
complementary oligonucleotide, and in the method by which the addition is
detected. The development of these promising new technologies
notwithstanding, the Sanger dideoxynucleotide procedure is still the most
commonly used method today and is well suited for small-scale sequencing
projects (in the kilobase-to-megabase range). 

Dideoxynucleotide Procedure for Sequencing DNA 

A dideoxynucleotide is a human-made molecule that lacks a hydroxyl
group at both the 2' and 3' carbons of the sugar moiety (Fig. 4.20A). In
contrast, a natural deoxyribonucleotide has a 3' hydroxyl group on the sugar
unit (Fig. 4.20B). Normally, during DNA replication, an incoming natural
nucleoside triphosphate is linked by its 5' α-phosphate group to the 3'
hydroxyl group of the last nucleotide of the growing chain (Fig. 4.21).
However, if a dideoxynucleotide is incorporated at the end of the growing
chain, DNA synthesis stops because a phosphodiester bond cannot be
formed with the next incoming nucleotide (Fig. 4.22). The termination of
DNA synthesis is the quintessential feature of the dideoxynucleotide
DNA-sequencing method, although other experimental conditions must be met
before a DNA sequence can be determined.

FIGURE 4.20 (A) A dideoxynucleotide. The 2' and 3' carbons of the sugar moiety lack
hydroxyl groups. (B) A deoxyribonucleotide.

In principle, the first step in the standard laboratory procedure for 
dideoxynucleotide DNA sequencing entails annealing a synthetic
oligonucleotide (17- to 24-mer; primer) to a predetermined segment of a strand
of the DNA to be sequenced, for example, to a segment of a cloning vector 
near the insertion site of the cloned DNA. The oligonucleotide acts as a 
primer by providing a 3' hydroxyl group for the initiation of DNA
synthesis.


FIGURE 4.21 Normal DNA synthesis. An incoming deoxyribonucleotide
(deoxyribonucleoside triphosphate [dNTP]) base pairs with the complementary
nucleotide of the template strand. The internucleotide linkage occurs between
the 3' hydroxyl group of the last nucleotide of the growing strand and the
α-phosphate group of the incoming nucleotide.

FIGURE 4.22 Blocked DNA synthesis. Chain growth is stopped by the addition of a
dideoxynucleotide to the end of the growing strand. The internucleotide linkage
between the last nucleotide, which is a dideoxynucleotide, and the next incoming
nucleotide cannot be formed because there is no 3' hydroxyl group on the
dideoxynucleotide sugar.


In the original method, the primed DNA sample is partitioned into
four separate reaction tubes. Each tube contains four deoxyribonucleotides
(dATP, dCTP, dGTP, and dTTP), one of which is radiolabeled, and one of 
the four dideoxynucleotides (dideoxyadenosine triphosphate [ddATP], 
dideoxycytidine triphosphate [ddCTP], dideoxyguanosine triphosphate 
[ddGTP], or dideoxythymidine triphosphate [ddTTP]). The concentration 
of each dideoxynucleotide in each reaction tube is carefully adjusted to 
ensure that it is incorporated into the mixture of growing chains at every 
possible site. Recall that chain growth stops as soon as a dideoxynucleotide 
is incorporated, so each growing chain will eventually contain a
dideoxynucleotide at its 3' terminus. This experimental condition is met in each of
the four reaction tubes. Consequently, after enzymatic DNA synthesis with 
DNA polymerase, each reaction tube will contain a unique set of
single-stranded DNA molecules of all possible lengths, each of which includes the
primer sequence at its 5' end (Fig. 4.23). 

The synthesis reactions are stopped by the addition of formamide, 
which prevents DNA strands from base pairing, and the DNA molecules, 
including the newly synthesized DNA molecules, are separated by
polyacrylamide gel electrophoresis. This separation procedure resolves pieces
of DNA that differ in size by as little as a single nucleotide. An
autoradiograph of the gel shows only the radiolabeled DNA fragments that were
produced during the enzymatic DNA synthesis step. Each of the four lanes
in the gel and the autoradiograph corresponds to a reaction tube that
contained one of the four dideoxynucleotides.

The sequence of a segment of DNA is determined by noting the order 
of the bands, as accurately as possible, from the bottom to the top of the 
autoradiograph. In the example shown in Fig. 4.24, the first 6 bases of the
sequenced DNA are AGCTGC. In this case, the fastest-migrating band (the 
radiolabeled fragment closest to the bottom), which corresponds to the 
smallest DNA fragment, is in the ddATP lane, the next band is in the 
ddGTP lane, the next is in the ddCTP lane, the next is in the ddTTP lane, 
and so on. Up to 500 bands can be resolved reliably on most
autoradiographs. Usually, the primer sequence is positioned about 10 to 20
nucleotides away from the target DNA so that the researcher can recognize the
known sequence at the start of the autoradiograph and thereby identify 
precisely the first nucleotide of the target DNA. 
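
The logic of reading a four-lane gel can be sketched in a few lines. The example below is a simplified model in which every possible termination product is present; fragment lengths are counted from the first nucleotide added to the primer, and the function names sanger_lanes and read_gel are illustrative. Each fragment appears in the lane of the dideoxynucleotide that terminated it, and ordering the bands from shortest (bottom of the gel) to longest (top) recovers the sequence.

```python
def sanger_lanes(new_strand):
    """Map each ddNTP lane to the lengths of the fragments it terminates."""
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read bands from the shortest fragment (bottom) to the longest (top)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = sanger_lanes("AGCTGC")
print(lanes)
print(read_gel(lanes))
```

With the 6 bases from the example in the text, the shortest fragment falls in the ddATP lane, the next in the ddGTP lane, and so on, and the reconstructed read is AGCTGC.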

Although the procedure described above is still used for some
specialized applications, in practice, the entire procedure has been automated and
typically uses nucleotides labeled with fluorescent dyes that are detected 
using a laser, rather than radiolabeled nucleotides which are visualized on 
an autoradiograph. Automated DNA sequencing minimizes manual 
manipulations and increases the rate of acquiring sequence data, which is 
essential for assembling vast amounts of nucleotide sequence data from 
whole prokaryotic and eukaryotic genomes. Automated sequence analysis 
can be carried out with four different fluorescent dyes, one for each
dideoxynucleotide reaction, or with the same fluorescent dye for each
dideoxynucleotide in each reaction mixture. In some cases, the primer, rather than
a dideoxynucleotide, is labeled with a fluorescent dye.


FIGURE 4.23 Primer extension during DNA synthesis in the presence of
dideoxynucleotides. Each of the four reaction tubes contains a unique set of
nucleotide extensions attached to the primer, because when a dideoxynucleotide
is incorporated into the growing strand, it terminates the synthesis. A few
full-length DNA molecules will be synthesized in each reaction tube. dNTP,
deoxyribonucleoside triphosphate.

FIGURE 4.24 Simulated autoradiograph of a dideoxynucleotide DNA-sequencing
gel. Each lane of the gel was loaded with the contents of one of the four reaction
tubes, which contained either ddATP, ddCTP, ddGTP, or ddTTP. By convention, the
bands of the autoradiograph are read from the bottom to the top. In this example,
the results of the sequence determination are shown on the right.


With a four-fluorescent-dye system, the samples at the completion of each
reaction are pooled,
and the fragments are separated in a single lane of a polyacrylamide gel or
polymer-filled capillary tube. This type of analysis is called "four-color, 
one-lane" detection. Alternatively, with one fluorescent dye marker, each 
sample is run in a separate lane; this is "one-color, four-lane" detection. 

Each fluorescent dye emits a narrow spectrum of light with a
distinctive peak when it is struck by an argon ion laser beam. The beam scans a
fixed location near the bottom of the electrophoretic matrix. As each
successive labeled fragment passes through the beam, excitation by the laser
causes an emission with specific spectral features that is detected by a
photomultiplier tube. The emission data are recorded and stored in a computer.
After a run is completed, the succession of fluorescent signals is converted 
to nucleotide sequence information. For a four-color, one-lane system, each 
fluorescent dye emits a different wavelength, and the order of spectral 
responses in a single lane corresponds to the sequence of nucleotides
(Fig. 4.25 and 4.26A to C). In other words, each dye represents a particular
nucleotide. With a one-color, four-lane system, the fluorescent signals from the
dideoxynucleotide-terminated fragments are recorded in succession across 
the four lanes. In this case, the overall order of the fluorescent signals as a 
function of each lane corresponds to a nucleotide sequence. Parenthetically, 
DNA sequencing with radioactive label is equivalent to the one-color,
four-lane format (Fig. 4.26D).

FIGURE 4.25 Automated fluorescent-dye terminator Sanger DNA sequencing.
Fragments that are synthesized with a primer (highlighted in pale green) from a
DNA template (highlighted in pale blue) and terminated with fluorescent
dideoxynucleotides are separated by capillary electrophoresis. A detector
records each successive laser-induced fluorescent signal (not shown). The
successive fluorescent signals are represented as a sequencing chromatogram
(colored peaks). In this case, the fluorescence for dideoxyadenosine (ddA),
dideoxyribosylthymine (ddT), dideoxyguanosine (ddG), and dideoxycytidine (ddC)
are green, purple, orange, and blue, respectively. Commercial automated DNA
sequencers run 96 (Applied Biosystems 3730xl) and 384 (GE Healthcare MegaBACE
4550) samples concurrently.


Generally, automated DNA-sequencing systems can read with high 
accuracy about 500 bases per run; under optimal conditions, one
instrument can resolve about 20,000 bases per hour. The electrophoretic matrix
for separating dideoxynucleotide-terminated products may be a slab gel or 
a liquid polymer in a capillary tube. Automated capillary DNA-sequencing 
machines handle larger numbers of samples with faster separations than 
do units that analyze slab gels. 

To produce large amounts of dideoxynucleotide-terminated fragments 
from small amounts of template DNA, PCR-based cycle sequencing is
commonly used. The setup and components for this method are the same as
those for the standard automated dideoxynucleotide sequencing protocol 
except that a thermostable DNA polymerase is required because the
process entails 25 or more cycles of denaturation, annealing, and elongation
(Fig. 4.27).


FIGURE 4.26 DNA sequence determination by the dideoxynucleotide procedure. (A
and B) Computer readout (A) of the nucleotide sequence from a four-color,
one-lane sequencing gel (B). The numbers denote nucleotide positions. (C)
Electrophoretic separation of the fluorescently labeled dideoxynucleotide
fragments in a single lane that correspond to the nucleotide sequence shown in
panel A. (D) Autoradiograph of a four-lane, one-label sequencing gel for the
nucleotide sequence shown in panel A. Redrawn from a figure supplied by Applied
Biosystems.


Since there is only a single primer in each dideoxynucleotide
reaction, the amplification of fragments is linear. The high temperature 
both reduces intrastrand base pairing (secondary structure), which blocks 
strand elongation, and minimizes false priming due to incomplete base 
pairing (mismatching) between the primer and DNA template. After the 
last cycle, formamide-treated samples are run in either one or four lanes, 
depending on the format of the experiment, and the sequence is
determined. Under optimal conditions, cycle sequencing resolves between 600
and 800 nucleotides at a time. 
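
The difference between the linear amplification of cycle sequencing and the exponential amplification of conventional PCR is easy to quantify. The comparison below is idealized (every template is assumed to be copied in every cycle), and both function names are illustrative.

```python
def cycle_sequencing_products(cycles, templates=1):
    """One primer: each template yields one new fragment per cycle."""
    return templates * cycles


def pcr_products(cycles, templates=1):
    """Two primers: products serve as templates, so copies double per cycle."""
    return templates * 2 ** cycles


for n in (1, 10, 25):
    print(n, cycle_sequencing_products(n), pcr_products(n))
```

After 25 cycles, a single template has produced only 25 terminated fragments by cycle sequencing, compared with more than 33 million copies in an ideal two-primer PCR.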

Primer Walking 

The entire sequence of pieces of DNA longer than about 500 bp cannot 
typically be obtained in a single run in most sequencing systems, and
therefore, a number of different strategies are used to obtain the complete
sequence of large DNA fragments. In one commonly used strategy, the
dideoxynucleotide sequencing reactions are carried out to determine the identity
and order of the first 500 or so nucleotides of the DNA. On the basis of this 
analysis, a second primer that is designed to hybridize to a region about 20 
nucleotides upstream from the end of the acquired sequence is chemically 
synthesized and then used to determine the sequence of the next 500
nucleotides. In a similar manner, a third primer-binding site is selected, another
oligonucleotide is synthesized, and the sequence of the next 500 bases is 
determined (Fig. 4.28). This "primer-walking" strategy proceeds until the 
entire cloned DNA is sequenced. To ensure that the overall sequence is
correct and that there is no ambiguity regarding the identity of any nucleotide,
both strands of the DNA must be sequenced. Different initial primers enable 
sequencing of both strands; one primer binds to the strand at one end of the 
insert, and the other primer binds to the opposite strand at the other end of 
the insert.


FIGURE 4.27 Cycle sequencing. The approximate temperatures and durations of
each step are noted. The primer is green, the template DNA is blue, and the
dideoxynucleotide-terminated fragments are red.


False priming of DNA synthesis can give erroneous and
ambiguous results. This situation may arise if the primer binds to more than one
region within the target DNA. To avoid this problem, the primers used for 
this procedure are generally at least 24 nucleotides long. In addition,
stringent annealing conditions do not permit spurious binding of the primer to
similar but nonidentical sequences. Primer walking has been used to
sequence pieces of DNA that have been cloned into bacteriophage λ (~20
kilobase pairs [kb]) or a cosmid vector (~45 kb).
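
A rough estimate of the number of sequencing runs needed to walk along one strand of an insert can be sketched as follows. This is a simplified model under stated assumptions: each run is taken to yield 500 nucleotides, each new run is assumed to begin 20 nucleotides before the end of the previous one, and the function name primer_walk_starts and its parameters are illustrative.

```python
def primer_walk_starts(insert_len, read_len=500, backoff=20):
    """Return the start position of each read needed to cover the insert."""
    if not 0 <= backoff < read_len:
        raise ValueError("backoff must be smaller than the read length")
    starts = []
    start = 0
    while True:
        starts.append(start)
        end = start + read_len
        if end >= insert_len:
            break
        start = end - backoff  # next primer sits just upstream of the read end
    return starts


print(primer_walk_starts(1200))
print(len(primer_walk_starts(20000)), "reads for one strand of a 20-kb insert")
```

Because each step advances by about 480 nucleotides in this model, one strand of a 20-kb insert requires about 42 successive reads, and the same number again is needed for the opposite strand.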

Pyrosequencing 

Pyrosequencing was the first of the second-generation sequencing
technologies to be made commercially available and has contributed to the recent
rapid output of large amounts of sequence data by the scientific community. 
The basis of the technique is the detection of pyrophosphate that is released 
during DNA synthesis. As part of the structure of a deoxynucleoside 
triphosphate, the phosphate attached to the 5' carbon of the deoxyribose 
sugar moiety is designated the α-phosphate, the next phosphate is the
β-phosphate, and the third is the γ-phosphate (Fig. 4.29). During replication,
the α-phosphate of each incoming complementary nucleotide is joined
enzymatically by a phosphodiester linkage to the 3' OH group of the last
nucleotide of the growing strand, and the β- and γ-phosphates are cleaved
off as a unit that is called pyrophosphate (Fig. 4.29).


FIGURE 4.28 DNA sequencing by primer walking. (1) DNA sequencing is initiated
with a primer (P1) that is complementary to a site on a plasmid near the point of
insertion of the cloned DNA. (2) Based on the segment of the cloned DNA that has
just been sequenced, a second primer (P2) that is complementary to a stretch of
about 20 nucleotides near the end of that segment is synthesized. (3) P2 is used to
sequence the next segment of cloned DNA. (4) Based on the segment of the cloned
DNA that has just been sequenced, a third primer (P3) that is complementary to a
stretch of about 20 nucleotides near the end of that segment is synthesized. (5) P3 is
used to sequence the next segment of cloned DNA. (6) Based on the segment of the
cloned DNA that has just been sequenced, a fourth primer that is complementary
to a stretch of about 20 nucleotides near the end of that segment is synthesized. The
process of successively synthesizing and using new primers continues until the
entire insert is sequenced.

The unambiguous detection of pyrophosphate during DNA strand
synthesis forms the basis for determining the DNA sequence of a template
strand. Specifically, the release of pyrophosphate is correlated with the 
incorporation of a known nucleotide in the growing DNA strand. A DNA 
fragment of unknown sequence is engineered at one end with a sequence 
that is complementary to a primer, and then, after the primer is added, one 
deoxynucleotide is introduced at a time in the presence of DNA polymerase. 
Pyrophosphate is formed only when the complementary nucleotide is
incorporated at the end of the growing strand. Obviously, nucleotides that are not
complementary to the nucleotide in the template strand are not incorporated 
and no pyrophosphate is formed. Thus, for this system, it was necessary to 
develop an accurate and rapid method for detecting pyrophosphate. 

The strategy for pyrosequencing entails a series of enzymatic reactions 
(Fig. 4.30).


FIGURE 4.29 Phosphodiester bond formation and release of pyrophosphate during
the incorporation of a nucleotide at the end of a growing DNA strand. Phosphodiester
bond formation occurs between the 3' OH of the deoxyribose sugar of the last
incorporated nucleotide and the α-phosphate of the incoming nucleotide (blue
arrow). The bond between the α- and β-phosphates is cleaved (green arrow), and
pyrophosphate is released (black arrow).

FIGURE 4.30 Pyrosequencing enzyme reactions. Black, nucleotide incorporation and
release of pyrophosphate; red, generation of light from pyrophosphate with ATP
sulfurylase and luciferase. The DNA sequence is determined by correlating the
extent of light emission with a particular nucleotide. Green, breakdown of
unincorporated deoxynucleoside triphosphate (dXTP) and any remaining ATP to
monophosphates (dXMP) by apyrase. AMP, adenosine monophosphate.


Specific Enzymatic Amplification of DNA In Vitro: the
Polymerase Chain Reaction

K. B. Mullis, F. A. Faloona, S. J. Scharf, R. K. Saiki, G. T. Horn, and
H. A. Erlich

Cold Spring Harbor Symp. Quant. Biol. 51:263-273, 1986

PCR, which is the invention of Kary Mullis (U.S. patent 4,683,202), has had a
tremendous impact on many research areas, including molecular biotechnology.
The capability of generating large amounts of DNA by amplification from
segments of cloned or genomic DNA has facilitated the cloning of DNA versions
of rare mRNA molecules, screening gene libraries, diagnostic testing for gene
mutations, physical mapping of chromosomes, and a myriad of other
applications. In fact, the first study using PCR described a diagnostic test
for sickle-cell anemia (Saiki et al., Science 230:1350-1354, 1985). PCR was a
unique idea that did not replace any existing technology. The power of the
method is in its simplicity, sensitivity, and specificity. It utilizes a
mechanism similar to that used by our cells to accurately replicate a DNA
template, it can detect and produce millions of copies from a single template
molecule in a few hours, and under appropriate conditions, it can be used to
amplify a specific sequence in a complex mixture of DNA molecules even when
other, similar sequences are present. Mullis received the Nobel Prize in
Chemistry for his work on PCR in 1993. Since 1986, more than 200,000 published
studies have used PCR in one way or another. Moreover, a Google search with
the phrase "polymerase chain reaction" yields more than 17,000,000 hits! Its
status as an indispensable method is well established, and considering the
large number of PCR applications that have already been devised, there seems
to be no end to its potential uses.


Briefly, the pyrophosphate generated by the incorporation of a
nucleotide is combined with adenosine-5'-phosphosulfate by the enzyme
ATP sulfurylase to form ATP. In turn, ATP drives the conversion of luciferin
to oxyluciferin by the enzyme luciferase, which generates light that is 
recorded by a photon detector. Before the next nucleotide is added to the 
mixture, ATP and any unincorporated deoxynucleoside triphosphate are
degraded to their monophosphate forms by the enzyme apyrase, which is an
ADP diphosphohydrolase that removes the γ-phosphate from nucleoside
triphosphates and the β-phosphate from nucleoside diphosphates (Fig.
4.30). Because the natural nucleotide dATP can participate in the luciferase
reaction, deoxyadenosine α-thiotriphosphate (dATPαS), which is used by
DNA polymerase but not luciferase, is substituted for dATP in the reaction
mixture.

The amount of light generated after the addition of a particular
nucleotide tends to be proportional to the number of nucleotides that are
incorporated in the growing strand (Fig. 4.31). The incorporation of any single
nucleotide produces an amount of light that falls within a limited range that 
represents the incorporation of one nucleotide, i.e., a 1-mer. A dinucleotide 
sequence, e.g., AA, TT, CC, or GG, generates a more intense signal that falls 
within the 2-mer range. This more or less linear relationship holds only for
stretches of up to about eight identical nucleotides in a row (homopolymer tracts).
Generally, regardless of the DNA-sequencing chemistry, homopolymer 
tracts are difficult to sequence accurately. 

During each incubation, the amount of the light signal is measured and 
recorded. At the end of the sequencing run, the light emissions are
represented as bars on a graph (pyrogram) where the y axis reveals whether the
incorporation is equivalent to one or more nucleotide units (mers) and the 
x axis presents the sequence of the strand that is complementary to the 
template sequence (Fig. 4.31C). 
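
The relationship between the order of nucleotide addition and the resulting pyrogram can be illustrated with a small simulation. This is an idealized sketch: the light signal is taken to be exactly proportional to the number of nucleotides incorporated, the nucleotides are assumed to be dispensed in the repeating order A, C, G, T, and the function names pyrogram and decode are illustrative.

```python
def pyrogram(new_strand, order="ACGT"):
    """Simulate (nucleotide, signal) pairs for the strand being synthesized."""
    if set(new_strand) - set(order):
        raise ValueError("strand contains a nucleotide that is never dispensed")
    flows = []
    pos = 0
    i = 0
    while pos < len(new_strand):
        base = order[i % len(order)]
        run = 0
        # A homopolymer tract is incorporated in full in a single round.
        while pos < len(new_strand) and new_strand[pos] == base:
            run += 1
            pos += 1
        flows.append((base, run))  # run == 0 means no light emission
        i += 1
    return flows


def decode(flows):
    """Rebuild the synthesized strand from the pyrogram signals."""
    return "".join(base * signal for base, signal in flows)


flows = pyrogram("AAGT")
print(flows)
print(decode(flows))
```

For the strand AAGT, the first addition (A) gives a 2-mer signal, C gives no light, and G and T each give a 1-mer signal. In practice, the signal becomes less reliable as homopolymer tracts lengthen, which this idealized model ignores.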

Sequencing Using Reversible Chain Terminators 

For pyrosequencing, in which DNA is sequenced by synthesis, that is, by 
nucleotide extension of a growing DNA strand, each of the 4 nucleotides 
must be added to the reaction sequentially in separate cycles. This process 
would be considerably faster if, for each cycle, all the nucleotides were
added together. For this, it is necessary to ensure that the growing DNA 
strands are extended by only a single nucleotide during each cycle and that 
the incorporated nucleotides are recognized individually. These objectives 
can be met with reversible chain terminators and four-color fluorescence, 
which form the basis of some new sequencing technologies. 

One approach entails capping the 3' carbon of the deoxyribose sugar 
with a chemical group that blocks subsequent addition of nucleotides and 
attaching a different fluorophore to each nucleotide at positions that do 
not interfere with either base pairing or phosphodiester bond formation
(Fig. 4.32).


FIGURE 4.31 DNA sequence determination by pyrosequencing. (A) Template strand
(blue background) with primer sequence (blue letters). (B) Signal (light) intensities
with four pyrosequencing cycles based on the template sequence in panel A. A
pyrosequencing cycle consists of four rounds, where one round represents the
addition of one of the four deoxynucleotides. Each round is followed by treatment
with apyrase. A minus sign denotes no light emission, and the plus signs indicate
the relative amount of light released after the introduction of a particular
deoxynucleotide. (C) Pyrogram based on the data in panel B and the deduced
nucleotide sequence of the strand that is complementary to the template
sequence in panel A.


It is essential that both the 3' blocking group and the fluorescent
dye be quickly and easily removed. Also, the decapping step must restore 
a hydroxyl group at the 3' position of the deoxyribose sugars to provide a 
site for phosphodiester bond formation with the next nucleotide. After 
fluorescent emissions are recorded, only 30 seconds is required to remove 
both the 3' blocking group and the fluorescent dye from each incorporated 
nucleotide and to leave a hydroxyl group at the 3' position of the
deoxyribose sugars. Cycles of nucleotide addition to the growing DNA strand by
DNA polymerase, acquisition of fluorescence data, and chemical cleavage 
of the blocking and dye groups are repeated to generate short sequence 
determinations (reads) of up to 36 nucleotides per run. Currently, read 
lengths are limited by difficulties in incorporating the fluorescent
nucleotides and incomplete cleavage of the dye or blocking groups.

Sequencing by Ligation 

In contrast to pyrosequencing and sequencing using reversible terminators, 
which extend the growing DNA strand by a single base during each cycle, 
sequencing by ligation extends the DNA strand by ligation of short
oligonucleotides in a template-dependent fashion and utilizes the enzyme ligase
rather than DNA polymerase. This procedure requires oligonucleotides 
that are usually 8 (octamers) or 9 (nonamers) nucleotides in length and are 
partially degenerate. That is, they contain a known (fixed) nucleotide at a 
specific (query) position and any nucleotide in the other positions. As 
shown in Fig. 4.33A, one set of nonamers has a fixed nucleotide (A, C, G, 
or T) in query position 1 and any nucleotide in positions 2 to 9; another set 
has a fixed nucleotide in query position 2 and any nucleotide in the other 
eight positions, i.e., 1 and 3 to 9, and so on. Moreover, all the nonamers with 
the same fixed nucleotide, regardless of position, are tagged at their 3' ends 
with the same fluorescent dye (Fig. 4.33A). Each of the four different fluorescent
dyes emits a distinctive wavelength that can be used to identify the
nucleotide in the query position. 

A short nucleotide adaptor is joined to the ends of the DNA fragments 
that are to be sequenced. The adaptor sequences serve as binding sites for 
anchor primers (Fig. 4.33B). An anchor primer provides a 3' end for ligation


FIGURE 4.32 Addition of reversible terminators to a primer sequence. A blocking 
group that prevents chain extension is attached to the oxygen of the 3' position of 
the deoxyribose sugars of each nucleotide. A different fluorescent dye (colored 
structures) is linked to each nucleotide by means of a blocking group and a linker 
sequence at position 9 of the purines (guanine [A] and adenine [B]) and position 5 
of the pyrimidines (cytosine [C] and uracil [D]). Palladium-catalyzed cleavage 
(arrows) removes the 3' blocking group and the fluorescent dyes with the restoration
of a hydroxyl group at the 3' carbon of each deoxyribose sugar. During the next
cycle, the growing strands are extended by one reversible terminator. The template
strand is not shown here. Note that the difference between deoxyuridine monophosphate
(dUMP) and deoxythymidine monophosphate (dTMP) is the absence of
the methyl group at position 5 of the pyrimidine ring of dUMP. The uracil of the 
modified deoxyuridine triphosphate (dUTP) pairs with adenine of the template 
strand, and dUMP is joined to the last nucleotide of the growing strand by DNA 
polymerase. Adapted, with permission, from Fig. 7 of the online Supporting
Information supplement at http://www.pnas.org/cgi/content/full/0609513103/DC1
accompanying J. Ju et al., Proc. Natl. Acad. Sci. USA 103:19635-19640, 2006.

FIGURE 4.33 Sequencing by ligation. (A) Sets of nonamers. The fixed position of each
nucleotide is denoted by a boldface colored letter. The colored dots represent different
fluorescent dyes. The letter N signifies that any nucleotide can occupy that
position in the nonamer. There are more than 65,000 different nonamer sequences 
when one position is fixed and the other eight sites are filled by any of the four 
nucleotides. (B) Representation of the first three cycles of sequencing by ligation of 
a template sequence. During cycle 1, the anchor primer anneals to the adaptor 
sequence at the 3' end of the template sequence. Nonamers with A, T, C, and G in 
the first query position are added, and the complementary nonamer with A in the 
first position hybridizes and is ligated to the anchor primer. After the fluorescent 
signal is recorded, the ligated anchor primer-nonamer strand is released by 
melting. During the following two cycles, nonamers with T and C in the second and 
third positions, respectively, are ligated to the anchor primer. In this example, the 
successive color signals from the fluorescent dyes are red, blue, and orange, which 
correspond to A, T, and C. 



to the 5' end of an adjacent completely hybridized nonamer. Ligation does 
not occur if there is a single noncomplementary base pair (mismatch) 
between a nonamer and template sequence. 

A sequencing cycle consists of the following steps. (1) The template 
DNA is denatured, and the anchor primer binds to its complementary 
sequence in the adaptor at the 3' end of the single-stranded template DNA. 
(2) A pool of nonamers with fixed nucleotides A, G, C, and T at the same 
query position, say position 1, is added and incubated for a brief period 
with T4 DNA ligase. (3) The nonligated nonamers and other components 
are washed away, and the fluorescent signal is recorded. If the nonamer 
sequence is exactly complementary to the template sequence, then T4 DNA 
ligase will join the nonamer to the anchor primer (Fig. 4.33B). The identity 
of the nucleotide in position 1 is determined by the fluorescence produced. 
(4) The ligated anchor primer-nonamer sequence is removed by increasing
the temperature, and the cycle is repeated using another pool of nonamers
with fixed nucleotides A, G, C, and T in query position 2 to identify the 
nucleotide in the second position on the template. These cycles are repeated 
to determine the sequences of short regions of DNA. 
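
The logic of these cycles can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not instrument software: the function names and the example template are hypothetical, the dye colors for A, T, and C follow the example in Fig. 4.33B, and the color for G is an assumption. Because only a fully complementary nonamer is ligated, the model simply reports the fixed nucleotide that pairs with the template base at the query position.

```python
# Toy model of sequencing by ligation (illustrative names and data only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Red, blue, and orange follow Fig. 4.33B; "green" for G is an assumption.
DYE = {"A": "red", "T": "blue", "C": "orange", "G": "green"}

# Number of nonamer sequences when one position is fixed and the
# other eight positions can hold any of the four nucleotides.
NONAMERS_PER_FIXED_BASE = 4 ** 8  # 65,536


def ligation_cycle(template_region, query_position):
    """Return (nucleotide, dye color) reported by one ligation cycle.

    template_region lists the template bases in the order in which they
    pair with nonamer positions 1 to 9, starting next to the anchor primer.
    """
    template_base = template_region[query_position - 1]
    fixed = COMPLEMENT[template_base]  # only this nonamer pool member ligates
    return fixed, DYE[fixed]


def sequence_by_ligation(template_region, n_cycles):
    """Run successive cycles, querying positions 1, 2, ..., n_cycles."""
    calls = [ligation_cycle(template_region, pos)
             for pos in range(1, n_cycles + 1)]
    read = "".join(base for base, _ in calls)
    colors = [color for _, color in calls]
    return read, colors


# Hypothetical template region; the first three cycles reproduce the
# red, blue, orange (A, T, C) succession of the example in Fig. 4.33B.
read, colors = sequence_by_ligation("TAGCCATGA", 3)
print(read, colors)
```

In this sketch the read is the strand complementary to the template, which is what the succession of fluorescent signals reports.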


Large-Scale DNA Sequencing 

Generally, DNA-sequencing projects fall into two categories: de novo 
genome sequencing and resequencing. Sequencing entire genomes that 
have not been previously sequenced is de novo genome sequencing, 
whereas resequencing entails comparing a newly determined sequence 
with a known reference sequence. Some of the applications for both de 
novo sequencing and resequencing include the identification of pathogenic 
strains, drug discovery, tests for disease-related mutations, forensic analyses,
and development of biological products for commercial and industrial
purposes.

Over the past 30 years, the cost of sequencing using the Sanger dideoxynucleotide
procedure has been lowered from more than $10 to less than
$0.05 per base. In addition, the speed of acquisition of DNA sequences has
been increased dramatically by the introduction of automated DNA
sequencers that run, depending on the instrument, either 96 or 384 capillary
electrophoretic separations concurrently. Generally, these machines generate
about 2 × 10^6 to 3 × 10^6 bases per day from more than 2,500 different
DNA templates. The success of the Sanger dideoxynucleotide method is due
to the highly accurate, long sequence reads of about 700 bases per sequencing 
reaction. These read lengths make it computationally easier to definitively 
determine overlapping sequences and thereby to assemble the nucleotide 
sequence of a genome, chromosome, or large gene, and for now, it remains 
the method of choice for de novo sequencing. However, the method is time-consuming,
mainly due to the requirement for electrophoretic separation of
the fragments, and expensive due to the relatively large volumes of chemicals
that are used. Substantial research funds have been granted by the U.S.
National Human Genome Research Institute for developing novel, inexpensive
technologies to facilitate large-scale sequencing projects. The nominal
objective of these efforts has been christened the "$1,000 genome." In other 
words, the goal is to reduce the cost of sequencing any human genome to 
$1,000. The National Human Genome Research Institute has targeted a 
"$100,000 genome" for 2009 and the "$1,000 genome" for 2014. In 2005, the 
sequencing of a human genome required about 10 months at a cost of about
$15 × 10^6, which is obviously less than the 15 years and the $2.7 × 10^9 that
were required to assemble the first human genome sequence (completed in
2003). The expectation is that a "$1,000 genome" would allow assessment of 
the risks of genetic diseases on an individual basis, as well as optimization 
of drug treatments for each patient, and of course, a "$1,000 genome" would 
mean that genome sequencing of microorganisms would cost a few dollars 
and that of single genes would cost pennies. 

Shotgun Cloning Strategy for Sequencing Genomes 

The shotgun cloning strategy has been used for de novo sequencing of thousands
of prokaryotic, eukaryotic, and viral genomes. First, genomic DNA is
randomly fragmented by sonication, nebulization, or hydrodynamic 
shearing. Breaking DNA by a physical method leaves extended (frayed)

FIGURE 4.34 Repair of the ends of frayed DNA (end repair) and phosphorylation
of blunt-end DNA. (A) DNA polymerase fills in from recessed 3' ends.
(B) 3' exonuclease degrades 3' extensions. (C) T4 polynucleotide kinase
phosphorylates the 5' ends of the blunt-end fragments. Note that fragments
with different combinations of extensions and recessed ends are not shown
here. In all of these cases, the polymerase and/or 3'-exonuclease activity
produces blunt-end DNA molecules.


ends that make cloning extremely inefficient. Consequently, the fragments 
are enzymatically blunt ended (end repaired, or polished) by filling in 3' 
recessed ends with T4 DNA polymerase in the presence of the four deoxyribonucleotides
and removing protruding 3' ends with 3' exonuclease activity
(Fig. 4.34A and B). To facilitate ligation, the 5' ends of the polished genomic 
fragments are phosphorylated with T4 polynucleotide kinase (Fig. 4.34C). 

Next, the fragments are separated into small (~1-kb), medium (~8-kb),
and large (~40-kb) fractions with which three libraries are created by cloning
the small- and medium-size pieces into plasmids and the large, 40-kb fragments
into a fosmid vector. Alternatively, the different-size DNA fractions
may be isolated first and then end repaired separately before being cloned
into different vectors. After transformation of E. coli with the library, colonies
with cloned DNA are picked and grown. Vector DNA is purified from
each library and amplified. Depending on the DNA sequencer, either 96 or 
384 sequencing templates are analyzed concurrently. To achieve a high 
degree of accuracy, each nucleotide site in a genome should be sequenced at 
least 6 to 10 different times from different fragments. The extent of 
sequencing redundancy is called coverage, or depth of coverage. 
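
As a rough illustration of what 6- to 10-fold coverage implies, the sketch below estimates the number of reads required, taking average coverage as (number of reads × read length) / genome size. The 700-base read length comes from the Sanger figures quoted earlier; the 4.6 × 10^6-bp genome is an assumed example of a bacterial genome, and the function name is hypothetical. The estimate is an average and ignores cloning bias and discarded low-quality reads.

```python
def reads_needed(genome_size_bp: int, coverage: int,
                 read_length_bp: int = 700) -> int:
    """Minimum number of reads whose combined length gives the
    requested average coverage of the genome."""
    total_bases = genome_size_bp * coverage
    # Ceiling division: a partial read still has to be sequenced.
    return -(-total_bases // read_length_bp)


# Assumed example: a 4.6-Mb bacterial genome at 6x and 10x coverage.
for depth in (6, 10):
    print(f"{depth}x coverage: {reads_needed(4_600_000, depth)} reads")
```

Under these assumptions, several tens of thousands of reads are needed even for a small genome, which is why 96 or 384 templates are run concurrently.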

A computer program, called a base caller, assesses the fluorescence 
chromatogram of each read and designates the nucleotides that most likely 
correspond to unambiguous peaks. After ignoring primer and vector 
sequences and removing low-quality base calls that commonly occur at the 
beginnings and ends of the chromatograms, another program, called an 
assembler, finds extensive overlapping segments. The process of generating
successive overlapping sequences produces long, contiguous stretches
of nucleotides (contigs). 
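
The idea of building contigs from overlapping reads can be sketched with a greedy merge, shown below. This is a deliberately simplified illustration and not the algorithm of any particular assembler: it assumes error-free reads from one strand, ignores repeats and base-call quality, and the function names, the minimum overlap, and the example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b,
    or 0 if it is shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap
    until no overlaps of at least min_len remain."""
    contigs = list(reads)
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Three hypothetical reads that overlap end to end form one contig.
print(greedy_assemble(["ATGGCGTAC", "GTACCTTAG", "TTAGGCAAT"]))
```

Reads that share no sufficiently long overlap remain as separate contigs, which corresponds to the gaps that are closed later by other means.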

Although assemblers are extremely effective, repetitive elements that 
in actuality occupy different genomic locations may be assigned to the 
same genomic region. This creates false contigs. This problem is overcome 
by using sequence information from both ends of an insert (paired ends, or 
mate pairs). Paired ends from the three libraries are situated ~1, ~8, or ~40 
kb apart, respectively. The assembler identifies sets of paired ends that, in 
turn, are used to order and orient contigs. This process is called scaffolding 
(Fig. 4.35). Linking contigs together may also be confounded by incomplete 
genomic libraries that are often due to certain sequences not being replicated
in E. coli. Additional cloning and sequencing may be required to
complete the overall sequence. After the assembly process is completed, 
any small gaps between contigs that remain are resolved (gap closure, or 
finishing) either by PCR of high-molecular-weight genomic DNA across 
each gap, followed by sequencing the amplification product, or by primer 
walking. The sequences adjacent to the gaps provide the information for 
the primer sequences. 

The genomes of more than 800 organisms, mostly bacteria, have been 
entirely sequenced by the shotgun strategy (Fig. 4.36), and incomplete (draft) 
versions of the genomes of more than 2,000 organisms are in the process of 

FIGURE 4.35 Scaffolding. A scaffold of three contigs (black) is detected by paired end 
sequences. The paired ends are matched by color. The dashed lines represent the 
different distances between the paired ends.

FIGURE 4.36 Flow diagram for whole-genome shotgun DNA sequencing. Samples
of genomic DNA may be fragmented under different conditions to enhance the
yield of small (~1-kb), medium (~8-kb), and large (~40-kb) pieces. It takes about 4
weeks to go from genomic DNA to purified templates. The time from the acquisition
of raw sequence data to a finished (completed) sequence depends on the size
of the genome, the extent of coverage, and the number of DNA sequencers that are 
used for the project. The procedures for cloning, colony selection, and template 
preparation apply to each library.

FIGURE 4.37 Features of adaptors A and B that are used for template preparation,
PCR amplification, and sequencing using a cyclic array-sequencing strategy with 
the 454 sequencing platform. 


being finished. One major DNA-sequencing center (Joint Genome Institute, 
Walnut Creek, CA [http://www.jgi.doe.gov]) has 70 instruments that each 
analyze 96 samples at a time and run 24 hours a day, 7 days a week, along 
with another 40 instruments that each process 384 sequencing samples concurrently
and run 24 hours a day, 5 days a week. The monthly output is about
2.8 × 10^9 bases. Under these conditions, with six-times coverage and 1 month
to prepare the sequencing samples, it would take about 8 months to determine
the genomic sequence of a single person.
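
The 8-month estimate can be checked with simple arithmetic, sketched below. The monthly output, the coverage, and the preparation time are the values given above; the human genome size of about 3.2 × 10^9 bp is an assumption that is not stated in this passage, and the function name is hypothetical.

```python
def months_to_sequence(genome_bp: float, coverage: float,
                       bases_per_month: float, prep_months: float) -> float:
    """Sample preparation time plus the time needed to read the
    genome the requested number of times."""
    return prep_months + genome_bp * coverage / bases_per_month


months = months_to_sequence(
    genome_bp=3.2e9,        # assumed size of a human genome
    coverage=6,             # six-times coverage
    bases_per_month=2.8e9,  # monthly output quoted above
    prep_months=1,          # time to prepare the sequencing samples
)
print(f"about {months:.1f} months")
```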

Cyclic Array Sequencing 

Cloning of large numbers of DNA fragments into plasmids, transforming 
the plasmids into bacterial cells, and then picking and extracting the plasmids
from thousands to millions of colonies for sequencing using the
Sanger dideoxynucleotide method is very time-consuming and expensive, 
even though many of these steps have been automated. To reduce the time 
and cost of large-scale sequencing, cyclic array sequencing strategies have 
been developed that (1) prepare libraries of DNA fragments for sequencing 
in vitro, (2) immobilize the sequencing templates in a dense array on a 
surface so that very small volumes of reagents are required for the 
sequencing reactions, and (3) utilize a sequencing-by-synthesis approach 
so that hundreds of millions of sequences can be acquired simultaneously 
(in parallel). In comparison to the 8 months required to sequence a human 
genome using the shotgun cloning-Sanger sequencing approach, cyclic 
array sequencing can provide the sequence of a human genome in 2 
months. 

Cell-free methods have been developed to construct libraries of 
sequencing templates that circumvent the requirement for preparation of 
clone libraries in bacterial cells. Instead, PCR is used to produce clonal 
clusters containing millions of copies of each template DNA molecule that 
are separated from other sequences. The term "polony," which is a contraction
of polymerase colony, has been coined to represent multiple copies of
the same nucleotide sequence that are confined to an entity, such as within 
a bead or on a solid or gel surface. One cyclic array-sequencing strategy 
entails capturing the PCR-amplified sequencing templates on the surface of 
a small bead, arraying the beads in wells of a plate, and then using pyrosequencing
to determine the sequences of the captured templates. This
strategy is often referred to as 454 sequencing after the company (454 Life 
Sciences) that developed the technology. 

The source DNA is fragmented to an average size of about 350 bp. The 
frayed ends are blunt ended, and the 5' ends are phosphorylated. Two nonphosphorylated
adaptor sequences (A and B) are ligated to the polished,
phosphorylated genomic DNA fragments (Fig. 4.37). The adaptors have 
sequences for PCR amplification of the genomic sequence, for priming the 
sequencing reaction, and for calibrating the signal output of the sequencing 
reaction. The elements of the long arm of adaptor A, reading from the 3' 
end, consist of a 4-nucleotide calibration (key) sequence, a 20-nucleotide 
sequencing primer site (left-template-specific primer; left-specific 
sequencing primer), and a 20-nucleotide PCR primer site (primer A). 
Adaptor B has a key sequence, a sequencing primer site (right-template- 
specific primer; right-specific sequencing primer), a PCR primer site 
(primer B), and a biotin tag on the 5' end (Fig. 4.37). Because the adaptors 
are not phosphorylated, during ligation, the 5' phosphate end of the 
genomic DNA is joined to the 3' hydroxyl group of an adaptor with a nick 
(gap) remaining at the 3' hydroxyl end of the genomic DNA. Complete 
double-stranded DNA molecules are formed by filling in from the 3' ends 
of the genomic DNA with a DNA polymerase that lacks exonuclease 
activity. The short adaptor strands are displaced during replication, and the 
long arms of the adaptors act as the templates (Fig. 4.38A). 

After ligation and filling in, a genomic DNA fragment may be flanked 
on each end by (1) an A and B adaptor, (2) only adaptor A, or (3) only 
adaptor B (Fig. 4.38B). The next step is designed to recover single-stranded 
DNA with adaptors A and B on each end of the genomic DNA fragment, 
i.e., A-genomic DNA-B. Streptavidin beads are added to the DNA sample, 
and the biotin on adaptor B binds to the streptavidin beads. Thus, both the 
A-genomic DNA-B and B-genomic DNA-B molecules are immobilized. 
On the other hand, the A-genomic DNA-A molecules do not bind and are 
subsequently washed away. Next, after denaturation, only the nonbiotinylated
strands of the A-genomic DNA-B molecules are released and collected.
The biotinylated strands that remain bound to the streptavidin
beads under these conditions are discarded (Fig. 4.38B). 
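
The outcome of this selection step can be traced with a small Python sketch. It is a simplified model with hypothetical names: each duplex is described only by the adaptor on its left and right ends, and each strand is assumed to carry at its 5' end the long arm of one adaptor, so that a strand is biotinylated only when that adaptor is B.

```python
from itertools import product


def is_biotinylated(five_prime_adaptor: str) -> bool:
    """Only adaptor B carries the biotin tag on its 5' end."""
    return five_prime_adaptor == "B"


def select_strands(left: str, right: str):
    """Return (fate of the duplex, strands released by denaturation).

    Assumption of this model: the top strand starts with the left
    adaptor at its 5' end and the bottom strand starts with the right one.
    """
    strands = [("top", left), ("bottom", right)]
    if not any(is_biotinylated(adaptor) for _, adaptor in strands):
        return "washed away", []  # nothing binds streptavidin
    released = [name for name, adaptor in strands
                if not is_biotinylated(adaptor)]
    return "bound to streptavidin beads", released


for left, right in product("AB", repeat=2):
    fate, released = select_strands(left, right)
    print(f"{left}-genomic DNA-{right}: {fate}; released strands: {released}")
```

In this model, only duplexes with one A and one B adaptor yield a released strand, which matches the recovery of single-stranded A-genomic DNA-B molecules described above.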

During the next phase of the procedure, each collected genomic 
sequence is individually amplified. To this end, 20-nucleotide oligomers 
with a sequence that is complementary to the PCR primer site of adaptor B 
are bound at their 5' ends to beads (DNA capture beads). Each DNA capture
bead carries more than 10^7 of these oligomers. Then, the isolated
A-genomic DNA-B strands are mixed with the DNA capture beads in a
ratio that allows only one DNA strand to base pair with one of the oligomers
on a bead (Fig. 4.39). Next, the beads and PCR reagents, including
PCR primers that anneal to sequences that are part of adaptors A and B, are 
stirred vigorously with oil to create a water-in-oil emulsion. The conditions 
are set so that there is a single bead, along with PCR components, in a 
water droplet within an oil globule (Fig. 4.40A). In other words, each oil 
globule is a separate reaction chamber. There may be as many as 1,000 of

FIGURE 4.38 Preparation of single-stranded genomic DNA fragments for the 454
sequencing platform. (A) The frayed ends of genomic fragments are polished (end 
repaired), phosphorylated, and ligated with adaptors A and B. (B) The ligation 
products are mixed with streptavidin-coated beads. (1) The adaptor A-genomic 
DNA-adaptor A molecules do not bind to the streptavidin-coated beads and are 
washed out. (2) The DNA molecules with biotin tags bind to the streptavidin-coated 
beads. (3) The adaptor A-genomic DNA-adaptor B strands without a biotin tag are 
released by melting, concentrated, and retained for sequencing. (4) DNA molecules 
with biotin tags remain bound to the streptavidin-coated beads and are discarded. 


these PCR "microreactors" per microliter. This form of PCR has been designated
emulsion PCR.
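
Why a dilute ratio of DNA strands to capture beads favors one strand per bead can be illustrated with a Poisson model. The use of a Poisson distribution and the mean of 0.1 strand per bead are assumptions made for this sketch; neither is taken from the procedure described above, and the function names are hypothetical.

```python
import math


def poisson(k: int, mean: float) -> float:
    """Probability of exactly k events for a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def bead_loading(mean_strands_per_bead: float) -> dict:
    """Fractions of beads that carry no strand, one strand, or more
    than one strand, assuming strands bind beads independently."""
    empty = poisson(0, mean_strands_per_bead)
    single = poisson(1, mean_strands_per_bead)
    return {
        "empty": empty,
        "single": single,
        "multiple": 1.0 - empty - single,
        # Among beads that carry DNA, the share with exactly one strand.
        "single_among_loaded": single / (1.0 - empty),
    }


loading = bead_loading(0.1)  # assumed mean of 0.1 strand per bead
for label, fraction in loading.items():
    print(f"{label}: {fraction:.3f}")
```

Under these assumptions most beads carry no DNA at all, which is consistent with the later need to enrich for beads that bear amplified DNA.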

During PCR cycling in each microreactor, strands with the same 
sequence as the isolated A-genomic DNA-B molecule are synthesized and 
base pair with the immobilized oligomers. This is followed by the synthesis
of a complete complementary strand from the 3' ends of the oligomers
(Fig. 4.40B to D). After 20 to 30 cycles, each bead has about 10^7
bound DNA molecules that are complementary to the original hybridized 
A-genomic DNA-B strand. Following PCR, the emulsion is broken, the 
beads are collected, and all the free DNA molecules are washed away. 
DNA strands that are base paired with the bound DNA strands are 
removed by melting them. Since not all of the DNA capture beads become 
enclosed in a PCR microreactor, it is necessary to concentrate the beads 
with attached full-length DNA extensions. These beads are enriched by 
hybridization to oligomers that are bound to magnetic beads and are 
complementary to the adaptor A sequence at the free 3' end of each full-length
immobilized DNA molecule. All the DNA capture beads that do not
hybridize are discarded. The DNA capture beads that are held by the oligomer
attached to the magnetic beads are released by melting them.
Finally, the magnetic beads are removed with a magnet. At this point, in 
preparation for DNA sequencing, a protein that binds to single-stranded 


FIGURE 4.39 DNA capture bead. Oligomers that are complementary to the PCR 
amplification sequence of adaptor B are attached at their 5' ends to a bead. Each 
DNA capture bead hybridizes with only one adaptor A-genomic DNA-adaptor B 
strand. The inset highlights the available 3' end of the immobilized oligomer.

FIGURE 4.40 Emulsion PCR. (A) Representation of a DNA capture bead with a
hybridized A-genomic DNA-B strand and PCR amplification primers in a water-in-oil
droplet (PCR microreactor) before the initiation of the PCR cycles. (B) DNA
capture bead with the original hybridized adaptor A-genomic DNA-adaptor B
strand and the complementary sequence extended from the 3' end of the immobilized
oligomer. Blue, adaptor A; green, genomic DNA; red, adaptor B. (C)
Representation of a DNA capture bead during PCR cycling with many immobilized 
full-length sequences that are complementary to the originally hybridized DNA 
strand. (D) A DNA capture bead with many copies of the same genomic sequence.

DNA, a sequencing primer that is complementary to the left-template-specific
sequence of adaptor A, and DNA polymerase are added to the
DNA capture beads. Although sample preparation may seem excessive, 
the actual process takes only 2 to 3 days. By contrast, cloning, growing 
bacteria, and preparing vector DNA for conventional whole-genome 
shotgun sequencing requires from 20 to 30 days. 

Pyrosequencing can be used to determine the nucleotide sequence by 
cycles of single-nucleotide addition followed by detection of light emission 
to determine which incorporated nucleotide releases pyrophosphate. The 
sequencing reactions are carried out in wells that hold picoliter volumes. 
(One picoliter is one millionth of a millionth of a liter.) Two other types of 
beads, in addition to the DNA capture beads (~28-µm diameter), are used
to enhance the efficiency of the sequencing reaction. First, for pyrosequencing,
luciferase and ATP sulfurylase are bound to small beads (2.8-µm
diameter). Second, microparticles (0.8-µm diameter), without any immobilized
molecules, are used to maintain uniformity within the reaction wells
and to support the DNA capture beads. The beads are mixed and applied
to a plate that has 2.6 × 10^6 or 0.8 × 10^6 wells, depending on the size of the
plate. Each hexagonal well is only large enough to accommodate one DNA
capture bead (Fig. 4.41). Optic fibers mounted next to the plate transmit the
light signal, which corresponds to the incorporation of a particular nucleotide,
from each of the wells to a sensor. The succession of signal outputs
(flow signals) that corresponds to the sequence of incorporated nucleotides 
from each well is captured and stored in a computer. 
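
How a succession of flow signals from one well might be converted into a nucleotide sequence is sketched below. The sketch assumes that the light signal is roughly proportional to the number of identical nucleotides incorporated in a flow, so that rounding each signal gives the length of the run. The flow order T, A, C, G, the example signal values, and the function names are assumptions for illustration only.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def decode_flows(signals, flow_order="TACG"):
    """Convert flow signals into the sequence of the synthesized strand.

    Each signal is rounded to the nearest whole number of incorporated
    nucleotides: about 0 means no incorporation, about 2 means two
    identical nucleotides in a row.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        bases.append(nucleotide * round(signal))
    return "".join(bases)


def reverse_complement(sequence):
    """Sequence of the template strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


# Hypothetical flow signals from two cycles of four nucleotide flows each.
flows = [1.0, 0.0, 2.1, 0.1, 0.0, 1.0, 0.0, 0.9]
synthesized = decode_flows(flows)
print(synthesized, reverse_complement(synthesized))
```

Rounding becomes less reliable as runs of identical nucleotides get longer, which is one reason this simple decoding should be read as an illustration rather than as the platform's base-calling method.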

A single sequencing round consists of flooding the wells, in succession, 
with one of the four deoxyribonucleotides, pyrosequencing reagents, and 
finally apyrase. This process is repeated for each nucleotide. The duration 
of a round is about 60 seconds. The "key" sequence on the adaptors is used 
to locate the wells with DNA capture beads and to calibrate the signal 


FIGURE 4.41 Wells of a PicoTiterPlate used for 454 sequencing. Reproduced with the 
kind permission of Roche Diagnostics North America, Indianapolis, IN.

output. Generally, high-quality sequence information is generated from 200
to 300 bases before the accuracy of the read drops off. At the end of a run,
the flow signal data from the wells are combined, and an assembler determines
the most likely sets of contiguous nucleotides. This platform generates
about 25 × 10^6 raw base reads in 4.5 hours, which is about 10 times
coverage for a genome with 2 × 10^6 bp. A range of large-scale sequencing
projects have demonstrated the utility of this approach, for example, the 
sequencing of a human genome (the genome of the Nobel laureate James 
Watson was used as a test case) and the sequencing of DNA in complex 
environmental samples that contain many microbial genomes with the aim 
of identifying candidate pathogens responsible for the mysterious collapse 
of honeybee colonies. Sequencing of the Neanderthal genome showed that 
PCR amplification of sequencing templates is especially useful when only 
small amounts of low-quality template are available, as is usually the case 
for ancient DNA samples. 

Other cyclic array-sequencing strategies use emulsion PCR, or another 
PCR-based method, for clonal amplification of sequencing templates in 
combination with sequencing by ligation or sequencing by base extension 
using reversible chain terminators. Methods for sequencing single DNA 
molecules without amplification are also being explored. One of these 
approaches envisions translocating a DNA molecule through a very small 
channel (nanopore) and reading the nucleotide sequence from the successive
perturbations of electrical conductance caused by specific base pairs.
An advantage of nanopore sequencing is that extensive template preparation
is not required. Moreover, hypothetically, sequence information could
be generated in microseconds. Finally, notwithstanding the advances in
large-scale DNA sequencing, it will be remarkable if any group ever wins
the daunting $10 million challenge, called the X Prize, for devising a platform
that will sequence 100 human genomes with 99.999% accuracy in 10
days for no more than $10,000 per genome. 


SUMMARY 


In addition to the various gene-cloning techniques, a number
of other procedures, such as the chemical synthesis, amplification,
and sequencing of DNA, are fundamental tools of
molecular biotechnology. The chemical synthesis of DNA 
using phosphoramidites is a stepwise method that produces 
single-stranded DNA molecules (oligonucleotides). A high 
efficiency of phosphodiester bond formation (coupling) is 
mandatory. Otherwise, at the end of the process, the sample 
contains very few full-length molecules. The most commonly 
used chemically synthesized oligonucleotides range from 
about 10- to 30-mers. Although the yields are low, for special 
applications, oligonucleotides with 80 to 100 nucleotides can 
be produced. To make double-stranded DNA molecules, the 
complementary strands are synthesized separately and then 
mixed together. Oligonucleotides are used as probes for 
screening gene libraries, as linkers and adaptors for cloning 
genes and adding novel restriction endonuclease sites to vectors,
as primers for dideoxynucleotide DNA sequencing and
PCR, and for assembling genes. 


PCR is an invaluable procedure that has innumerable 
applications. With this technique, specific segments of DNA 
are amplified over a millionfold. This amplification is achieved 
with two primers that hybridize to segments of DNA on opposite
strands and have their 3' hydroxyl ends facing each other.
The primers flank the target sequence. The process entails 30
or more successive cycles, with each cycle consisting of denaturation,
renaturation, and in vitro DNA synthesis. Because it
would be tedious and costly to add DNA polymerase at the 
end of each cycle, a DNA polymerase that is not inactivated at 
the high denaturation temperature (95°C) is used throughout. 
During the DNA renaturation step, the primer sequences, 
which are present in excess, hybridize to the sample DNA in 
the first cycle and to primer sites in the amplified DNA 
product molecules in subsequent cycles. In the DNA synthesis 
step, a new DNA strand that is complementary to its template 
strand grows from the 3' end of the primer. Among its various 
applications, PCR can be used to detect a specific nucleotide 
sequence in a biological sample, to obtain large amounts of a
particular DNA sequence either for cloning or for DNA
sequencing, or to assemble a synthetic gene. 

Among other uses, knowledge of the complete sequence of 
a gene enables researchers to both optimize the function of a 
coding sequence in a particular host cell and maximize the 
production of an economically important protein. The dideoxynucleotide
method that was developed by Sanger and his
colleagues is routinely used to sequence DNA. This technique
relies on the in vitro incorporation of a dideoxynucleotide into
a growing DNA strand. A dideoxynucleotide terminates the
synthesis of a strand because it does not have a 3' hydroxyl
group. In a sequencing experiment, four DNA synthesis reactions,
each with a different dideoxynucleotide, are carried out
independently. The DNA fragments of different lengths that
are produced during each reaction are separated electrophoretically
in an individual lane of a polyacrylamide gel. The
pattern of separation of the synthesized radiolabeled fragments
is used to determine the nucleotide sequence of the
growing strand. Automated DNA sequencers that detect fluorescent
dyes are now used routinely for both small- and large-scale
sequencing projects. A common format entails using
dideoxynucleotides that are each labeled with a different fluorescent
dye, i.e., a one-lane, four-color detection system. In
this case, the dideoxynucleotide reaction products are generated
using a thermostable polymerase and a PCR-based
approach, mixed, and then separated in a single lane of a polyacrylamide
gel or a polymer-filled capillary tube. The sequence
of fluorescent signals is recorded and converted to the corresponding
nucleotide sequence.

Pyrosequencing entails correlating the release of pyrophosphate,
which is recorded as the extent of an emission of light,
with the incorporation of a particular nucleotide into a
growing DNA strand. Pyrophosphate is enzymatically combined
with adenosine-5'-phosphosulfate to form ATP, which
in turn acts as a substrate for luciferase, which causes the
emission of light. Sequencing using reversible chain terminators
also reveals the sequence of a DNA fragment by detecting
single-nucleotide extensions; however, in contrast to pyrosequencing,
the 4 nucleotides are added to the reaction together
in each cycle, and after the unincorporated nucleotides are
washed away, the nucleotide incorporated by DNA polymerase
is distinguished by its fluorescent signal. The fluorescent
dye and a blocking group that prevents addition of more
than 1 nucleotide during each cycle are chemically cleaved,
and the cycle is repeated. In another method, short sequences
can be determined by ligating degenerate oligonucleotides
that have a known nucleotide at the query position to an
anchor primer in a DNA template-dependent fashion. The
identity of the nucleotide in the query position is determined
by the corresponding fluorescent signal.

Traditionally, large-scale DNA sequencing of genomes has 
relied on the four-color Sanger dideoxynucleotide method 
and instruments that run 96 or 384 capillary gels in parallel. 
The Sanger method generates sequence lengths from 600 to 
800 bases, and effective software exists for assembling long 
contigs from the sequence data. Also, mate pairs and other 
strategies facilitate the ordering of the contigs. Finally, gaps 
are closed by various methods, such as primer walking, to 
produce a finished sequence. Although the whole-genome 
shotgun strategy has been the method of choice for sequencing 
genomes, novel, massively parallel DNA-sequencing plat¬ 
forms have been developed to significantly reduce the cost of 
sequencing and to enhance the speed of sequence data acqui¬ 
sition. The ostensible objective of these "next-generation" 
technologies is the $1,000 genome, which denotes the antici¬ 
pated cost of sequencing a human genome. To date, massive 
parallelization has been achieved with cyclic array-sequencing 
approaches that array clusters, each containing millions of 
copies of a single PCR-amplified sequence, on a surface and 
then acquire hundreds of millions of sequences simultane¬ 
ously using one of the non-Sanger sequencing methods 
described above. In addition, assemblers have been developed 
that efficiently and accurately construct contigs from data 
based on short nucleotide reads from about 25 to 200 nucle¬ 
otides in length. Various strategies for sequencing single DNA 
molecules, such as nanopore DNA sequencing, are in the early 
stages of development. 

REVIEW QUESTIONS 

1. If your new DNA synthesizer has an average coupling 
efficiency of 98.5%, what overall synthesis yield would you 
expect after the synthesis of a 50-mer DNA hybridization 
probe? 

2. Suggest two different strategies for synthesizing a 0.5-kb 
gene. Discuss the advantages and disadvantages of both 
methods. 

3. What is a linker? How is it used? 

4. What are the essential components of a PCR used to 
amplify a specific sequence of DNA? 

5. Outline the steps in a PCR cycle. 

6. Discuss the significance of long templates and short tem¬ 
plates and their prevalence as the number of cycles increases 
during PCR. 

7. Discuss how PCR is used to synthesize a gene. 

8. What is a primer? What are the key requirements of an 
effective primer? Describe some of the techniques that 
require primers. 

9. What is a dideoxynucleotide? How is it used to determine 
the sequence of a DNA molecule? 


10. Why is it necessary to make DNA single stranded before 
determining its sequence? 

11. Draw the autoradiograph derived from dideoxynucle¬ 
otide sequencing of CCTGATCTTAGCCAT. 

12. Outline the basic features of pyrosequencing. 

13. How are incorporated nucleotides recognized after each 
cycle of sequencing by synthesis using reversible chain ter¬ 
minators? How does this differ from pyrosequencing? 

14. How are short sequences of a DNA fragment determined 
using oligonucleotides and DNA ligase? 

15. Describe the basic features of whole-genome shotgun 
sequencing. 

16. What are the advantages of cyclic array sequencing compared
to the shotgun cloning method for large-scale
sequencing?

17. Describe how a million or more copies of the same DNA 
molecule are attached to a single bead. 



Functional Genomics 

DNA Microarray Technology 
Serial Analysis of Gene Expression 

Proteomics 

Separation and Identification of 
Proteins 

Protein Expression Profiling 
Protein Microarrays 
Protein-Protein Interaction Mapping 

SUMMARY 



Bioinformatics, Genomics, 
and Proteomics 


Molecular Databases 

SCIENTIFIC INFORMATION IS GENERALLY PRESENTED for expert scrutiny in 
peer-reviewed articles that are published in professional periodicals. 
However, in the mid- to late 1980s, molecular biology journals were 
devoting more and more pages to DNA, RNA, and protein sequences 
derived from individually cloned genes. Moreover, anyone who wanted to 
conduct comparative analyses of nucleotide or amino acid sequences 
among related genes or proteins, as well as other kinds of analyses, had the 
unenviable task of typing out all the sequences from the relevant publica¬ 
tions. The GenBank database was established in 1982 in anticipation of the 
increasing availability of DNA sequences. Its purpose was the collection, 
management, storage, and distribution of sequence data. At first, submis¬ 
sions to GenBank were sporadic, but almost complete compliance was 
achieved when many journals made database submission a prerequisite for 
publication. From that point on, the database became an integral part of 
sequence-based research. Initially, access to the GenBank database was 
through servers that were linked to NSFnet (National Science Foundation 
Network). By current standards, the original GenBank interface was prim¬ 
itive, and the download times were interminable. From 1990 to 1995, large- 
scale projects, such as genetic and physical mapping of the human genome, 
partial sequencing of thousands of complementary DNAs (cDNAs) 
(expressed sequence tags [ESTs]), and sequencing of entire genomes, 
required additional databases and the expansion of the existing databases 
for storing and retrieving information. NSFnet was replaced in 1995 by the 
Internet (World Wide Web). Thereafter, submissions, access, and especially 
retrieval (data mining) of stored molecular information became rapid and 
easy. The visual aspects of the online sites improved dramatically, and links 
among relevant databases were established. Ready access through the 
Internet led to the creation of specialized databases for gene-specific muta¬ 
tions, regulatory sequences, mitochondrial genes and functions, specific 
human genetic diseases, protein-protein interactions, protein structures, 
gene expression data, and many other types of data (Table 5.1). 

Generally, the area of research that generates, analyzes, and manages 
the mountains of information about genome sequences and all the genes 
that are transcribed in various cell types and tissues has been designated 
genomics. Studies of the entire protein populations of various cell types
and tissues and of their numerous protein-protein interactions have been
dubbed proteomics. As new methods were implemented and research targets
became more focused, other "omics," such as metagenomics, functional
genomics (transcriptomics), and metabolomics, emerged. Generally,
the suffix "omics" implies large-scale, whole-genome experimentation, 
with the analysis of many samples at one time. As a consequence of this 
high throughput, there is heavy reliance on computers and computer pro¬ 
grams for assembling, analyzing, archiving, and distributing genomic data. 
Diverse computer resources have been developed to handle the various 
kinds of genomic information. 

The amount of information generated from massively parallel experi¬ 
mental systems is huge. For example, since its inception, the sequence 
information in GenBank has been doubling every 18 months (Fig. 5.1). 
More specifically, after 25 years, GenBank contained more than 80 billion 
nucleotide bases from about 80 million sequences derived from more than 
100,000 different organisms. Currently, there are more than 900 molecular 
databases that range from major DNA and protein sequence repositories 
(e.g., GenBank, Ensembl, UCSC Genome Browser, Genome Database, Universal 
Protein Resource, and the Ribosomal Database Project) that provide 
genomic and proteomic data to bibliographic and informational resources 
(e.g., OMIM, PubMed, RefSeq, and Gene Clinics). 
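
The doubling figure quoted above can be turned into a rough extrapolation. The following is only a back-of-the-envelope sketch: it assumes that the 18-month doubling time stays constant, which real databases need not obey, and the function names are illustrative rather than part of any database tool.

```python
import math

DOUBLING_TIME_YEARS = 1.5  # 18 months, as stated in the text


def projected_bases(current_bases: float, years_ahead: float,
                    doubling_time: float = DOUBLING_TIME_YEARS) -> float:
    """Extrapolate database size, assuming a constant doubling time."""
    return current_bases * 2 ** (years_ahead / doubling_time)


def years_to_grow(fold_increase: float,
                  doubling_time: float = DOUBLING_TIME_YEARS) -> float:
    """Years needed for the database to grow by the given factor."""
    return doubling_time * math.log2(fold_increase)


if __name__ == "__main__":
    start = 80e9  # "more than 80 billion" bases, taken from the text
    for years in (1.5, 3, 6):
        print(f"after {years} years: {projected_bases(start, years):.2e} bases")
    print(f"years for a 1,000-fold increase: {years_to_grow(1000):.1f}")
```

Under this assumption, two doublings (3 years) quadruple the content, and a 1,000-fold increase takes roughly 15 years.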

Clearly, the proliferation of molecular databases would not have been 
possible without either high-speed computers or the programmers who 
develop the means whereby the information can be used by the scientific 
community. Because of its distinctive nature, the extensive use of computers 
for storage and analysis of molecular data has become known as bioinfor¬ 
matics. Broadly speaking, bioinformatics is the development and applica¬ 
tion of computational tools for the submission, storage, organization, 
archiving, acquisition, analysis, and visualization of biological and medical 
data. Database contents are routinely accessed through Internet sites that 
have tutorials (e.g., http://www.ncbi.nlm.nih.gov/Education/) with 
instructions for accessing the information and explanations of how to use 
the software tools for sequence similarity searches, gene prediction, and 
many other kinds of analyses. Most molecular databases are designed to 
meet the needs of researchers and are not meant for curious visitors. There 
are, however, many useful websites that provide overviews of genomics, 
proteomics, bioinformatics, and other related topics. 

In sum, the adage "necessity is the mother of invention," which was 
coined by Richard Franck (1624-1708), aptly encapsulates how bioinfor¬ 
matics has advanced both genomic and proteomic research. Programs are 
written to analyze and visualize the experimental data, and accessible data¬ 
bases have been developed that are augmented with additional descriptive 
features (annotations) and linked to other sources of data to combine as 
much relevant material as possible. Information is now accessible that
would have been impossible to obtain in the past, and importantly, novel
studies can be contemplated that previously were considered impossible.


TABLE 5.1 Types of molecular databases 

Bibliographic resources 
Cellular regulation 
Chromosome aberrations 
Comparative genomics 
Gene expression 
Gene identification and structure 
Gene mutation 
Gene sequences 
Genetic and physical maps 
Genetic disorders 
Genomic variants 
Genomic sequences 
Intermolecular interactions 
Metabolic pathways 
Protein motifs 
Protein sequences 
Protein structure 
Proteome resources 
RNA sequences 
Transgenic organisms 

FIGURE 5.1 Growth of the base pair content and sequence records in the GenBank 
DNA database. Note that the y axes are logarithmic scales. 


Metagenomics 

For more than 100 years, the identification of microorganisms and the char¬ 
acterization of their biological functions have required cultivating each 
strain in the laboratory. In the 1990s, with the emergence of techniques to 
extract DNA directly from environmental samples, such as soil and sea¬ 
water, researchers began to examine the sequence diversity of microorgan¬ 
isms using the universal 16S ribosomal RNA gene as a taxonomic marker. 
These studies revealed that less than 1% of all bacterial species could be 
cultured, and therefore, novel genes that might be of considerable interest 
for basic and applied research were inaccessible using methods that 
depended on growth of bacteria in the laboratory. Considering the wealth 
of biotechnologically important genes and proteins that had been obtained 
from the relatively few culturable microorganisms, the possibility of har¬ 
vesting useful genes from the much greater number of unculturable micro¬ 
organisms was exciting, if not daunting. With the development of 
high-efficiency cloning, robotic workstations that handle thousands of 
transformed cells, inexpensive DNA sequencing, microbial DNA sequence 
databases, and bioinformatics resources for processing, standardizing, and 
storing information, it has become possible to access the genomes of uncul¬ 
tured organisms from environmental and clinical samples. The study of the 
collective genomes in these samples is known as metagenomics. 

The primary objective of a metagenomic project is to construct a com¬ 
prehensive DNA library from all the microorganisms of a particular eco¬ 
system or location (Table 5.2). The metagenomic clones can be characterized 
in various ways (Fig. 5.2). One tactic entails sequencing the entire library 
using the shotgun sequencing strategy with the aim of assembling contiguous
sequences of DNA (contigs) from as many different genomes as possible
and identifying both novel and homologous gene sequences. For
example, a massive study that included 50 ocean samples from locations in 
the North Atlantic through the Panama Canal to the South Pacific yielded 
6.3 × 10⁹ bases of sequence. Analysis of the assembled and nonassembled
sequences indicated that there might be as many as 400 new bacterial species
among the samples, with about 1 × 10⁶ genes that lack significant
sequence similarity with any known gene. The analysis also revealed 
sequences encoding potentially novel forms of many proteins, including 
proteins for repair of ultraviolet light-induced DNA damage and RuBisCO 
(ribulose bisphosphate carboxylase), an enzyme that is important for 
carbon fixation. 
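
The idea of assembling contigs from overlapping shotgun reads can be illustrated with a toy greedy assembler. This is a minimal sketch under strong assumptions (error-free reads, all from the same strand, no repeats); it is not how production assemblers work, and the function names and the minimum-overlap parameter are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


if __name__ == "__main__":
    reads = ["CCTGATC", "GATCTTAG", "TTAGCCAT"]
    print(greedy_assemble(reads))
```

Reads that share no sufficiently long overlap are left as separate contigs, which mirrors the gaps that remain in a real metagenomic assembly when coverage of a genome is low.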

Sequence-based metagenomic projects are especially effective with 
microbial communities that have relatively few species. For example, some 
bacteria are able to thrive on pyrite (iron sulfide) ore sediments and are 
associated with extremely acidic runoff from metal and coal mines (acid 
mine drainage). Not only is this acidic water (often pH <1.0) harmful to 
aquatic and terrestrial ecosystems, but it leaches out environmentally haz¬ 
ardous heavy metal contaminants, such as copper, lead, zinc, and cadmium, 
from the ore sediments and mine tailings. The toxic runoff often continues 
long after the mining operation has been abandoned. Thus, there is consid¬ 
erable interest in learning more about the metabolic pathways of the micro¬ 
organisms found in these environments and how they survive under such 
conditions. This information may contribute to more effective control mea¬ 
sures for curtailing the production of acid. In one metagenomic study of an 
abandoned mine site in California, the nearly complete genomes of the two 
major bacterial species (Leptospirillum group II and Ferroplasma type II) and 
partial genomes of three other microbes were cloned and assembled. 
Although more research is required to determine the metabolic dynamics of 
this microbial community, gene assignments for the different genomes indi¬ 
cated that a rare member of the assemblage, Leptospirillum group III, plays a 
critical role. It may be the only organism that fixes atmospheric nitrogen in 
this environment, and as a result, it acts as the primary source of nitroge¬ 
nous compounds and ammonia for the other microorganisms. 

TABLE 5.2 Sources of some metagenomic libraries 

Abandoned metal and coal mines 
Arctic plankton 
Freshwater lakes and rivers 
High-temperature springs and mudholes 
Intestinal microbial communities from insects, humans, and mice 
Marine sponges 
Oceans 
Sediments of all types 
Sewage sludge 
Soils of all types 

FIGURE 5.2 Construction of metagenomic libraries from various sources. Bacteria 
from marine or freshwater samples are concentrated by using a 0.1- to 0.8-µm-pore-size
filter, and viruses are concentrated by using a <0.1-µm filter before the DNA is 
extracted and then fragmented. Libraries containing small DNA fragments, inserted 
into a plasmid, are screened for novel genes, while libraries containing larger DNA 
fragments, inserted into a fosmid, cosmid, or bacterial artificial chromosome vector, 
are screened for operons encoding protein complexes or metabolic pathways. 


Metagenomic libraries are frequently screened for enzyme activity to 
identify novel enzymes with biotechnological potential (Fig. 5.2). Selection 
for growth of transformed Escherichia coli cells on particular substrates, 
complementation tests, and, most often, simple indicator systems are used 
for these studies. In one example, a metagenomic library was screened for 
cloned lipase genes by growing transformed cells on agar plates that were 
supplemented with various triglyceride substrates, such as tricaprylin, and 
isolating colonies that were surrounded by a clear zone, i.e., a halo. The halo 
indicated that the colony produced and secreted an enzyme that digested 
tricaprylin. Relatively large DNA fragments (typically between 5 and 40 
kilobase pairs [kb]) must be cloned to ensure that all genes encoding pro¬ 
teins in the pathway for catabolism of a substrate, often encoded in a poly- 
cistronic operon, are present in a transformed cell. These function-based 
metagenomic projects have identified a myriad of enzymes (Table 5.3). 

The availability of robotic systems for maintaining and transferring 
transformed cells has been extremely helpful for expression-based studies, 
because on average, about 10⁴ to 10⁶ metagenomic clones must be screened
to detect one or two positive colonies. However, there is an inherent limitation
with the host cell, usually E. coli, for selection schemes that depend on
transcription and translation of the cloned gene. Computer modeling using 
codon usage and other transcription and translation features from the 
genes of many different microorganisms suggests that only 40% of the heterologous
genes will be expressed in E. coli. Consequently, to increase the
likelihood of identifying additional novel genes, broad-host-range vectors
and other host cells are being used for constructing and maintaining metagenomic
libraries.
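
The scale of such screens can be estimated with a simple probability model. The sketch below assumes that each clone is an independent trial with the same chance of being positive, and that the 40% expression figure simply scales that chance; both are simplifying assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def prob_at_least_one(n_clones: int, hit_rate: float) -> float:
    """Probability that a screen of `n_clones` contains at least one positive."""
    return 1.0 - (1.0 - hit_rate) ** n_clones


def clones_needed(hit_rate: float, confidence: float = 0.95) -> int:
    """Smallest library size giving at least one positive with `confidence`."""
    # Solve 1 - (1 - f)^N >= P for N; log1p keeps precision for tiny f.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-hit_rate))


if __name__ == "__main__":
    expressed_fraction = 0.40  # share of foreign genes expressed in E. coli (from the text)
    for hit_rate in (1e-4, 1e-5, 1e-6):
        n_all = clones_needed(hit_rate)
        n_expr = clones_needed(hit_rate * expressed_fraction)
        print(f"hit rate {hit_rate:.0e}: {n_all} clones, "
              f"or {n_expr} if only 40% of genes are expressed")
```

The model shows why a lower effective hit rate in a single host translates directly into a proportionally larger library, and hence why alternative hosts are attractive.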

Specialized gene expression systems aid in detecting metagenomic 
clones that carry genes with certain functions. One example of this type of 
strategy has been called substrate-induced gene expression (SIGEX) 
screening (Fig. 5.3). As the name suggests, this procedure identifies catabolic 
genes that are expressed when their promoters are activated in the presence 
of particular substrates and relies on the cloning of regulatory elements that 
are often found upstream of the catabolic genes that they control. The 
system utilizes a vector that contains the green fluorescent protein (gfp) gene
under the control of the lac promoter (p_lac) in a pUC-based plasmid, designated
p18GFP (Fig. 5.3A). The cloning site lies between the lac promoter and
the gfp gene. DNA from a microbial community is fragmented and cloned
into p18GFP. The cells are grown in the presence of ampicillin, to prevent
the growth of untransformed cells, and IPTG (isopropyl-β-D-thiogalactopyranoside),
which induces the expression of green fluorescent protein
from the lac promoter. Cells that produce green fluorescent protein in
the presence of IPTG are those that carry plasmids without inserts (i.e., self- 
ligated plasmids), plasmids with inserts that do not prevent transcription of 
gfp from the lac promoter (for example, the insert does not contain a tran¬ 
scriptional terminator), or plasmids with inserts containing promoters that 
are constitutively active. Transformed cells of interest in this procedure are 
those that do not produce green fluorescent protein in the presence of IPTG
because they carry plasmids with inserts that do not induce expression of
gfp under these conditions. To remove cells that produce green fluorescent
protein in the presence of IPTG, the transformed cells are subjected to
fluorescence-activated cell sorting (FACS). Briefly, FACS consists of streaming
cells in single file past a laser beam that detects the excitation of a fluorochrome
that is either attached to or, as in this example, inside the cell and
sorts fluorescent and nonfluorescent cells into separate collection vessels
(Fig. 5.3B). Accordingly, with SIGEX screening, the cells with green fluorescent
protein, which fluoresces green when exposed to blue light, are separated
from the cells that do not synthesize green fluorescent protein and,
therefore, do not fluoresce.


TABLE 5.3 Some of the gene-encoded enzymes identified in 
metagenomic libraries 

α-Amylases 
Antibiotic resistance enzymes 
Antibiotic synthesis enzymes (e.g., polyketide synthases) 
Aromatic hydrocarbon catabolic enzymes 
Cellulases 
Chitinases 
Dehydrogenases 
1,4-α-Glucan branching enzymes 
Lipases, esterases 
Nitrilases 
Oxidoreductases 
Pectin lyases 
Proteases 
Vitamin biosynthesis enzymes 
Xylanases 

The green fluorescent protein-negative cells are then grown in the pres¬ 
ence of a low-molecular-weight substrate, for example, benzoate. The pur¬ 
pose of this step is to identify the metagenomic clones that carry cloned 
DNA segments that are required for activation of catabolic genes by the 
interaction with the target substrate, e.g., benzoate. Regulatory elements, 
including genes encoding regulatory proteins and the DNA elements that 
they bind to in the presence of the target substrate, are usually close to the 
promoter of the catabolic operon, and therefore, at least a portion of the 
catabolic operon is likely to also be present on the cloned segment. If a 
catabolic operon is activated by the substrate and transcription continues 
through the gfp gene, then green fluorescent protein will be produced. 
Consequently, a second round of FACS is carried out, and the cells that 
express green fluorescent protein in the presence of the substrate are 
retained (Fig. 5.3B). This procedure enables rapid, high-throughput 
screening of a metagenomic library with various substrates. An initial test 
of SIGEX with a groundwater metagenomic library yielded seven different 
operons that were induced by benzoate and two induced by naphthalene. 
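
The two rounds of sorting amount to a pair of filters applied in sequence. The following sketch models that logic only; the `Clone` record and its boolean fields are hypothetical stand-ins for what FACS measures, and the example clones are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    name: str
    gfp_with_iptg: bool       # fluoresces in round 1 (IPTG, no substrate)
    gfp_with_substrate: bool  # fluoresces in round 2 (substrate added)


def sigex_screen(clones):
    """Return the clones that pass both SIGEX sorting rounds."""
    # Round 1: discard anything that fluoresces with IPTG alone
    # (self-ligated vector, read-through inserts, constitutive promoters).
    round1_negative = [c for c in clones if not c.gfp_with_iptg]
    # Round 2: keep only clones that fluoresce once the substrate is added.
    return [c for c in round1_negative if c.gfp_with_substrate]


if __name__ == "__main__":
    library = [
        Clone("self-ligated vector", True, True),
        Clone("constitutive promoter insert", True, True),
        Clone("silent insert", False, False),
        Clone("benzoate-inducible insert", False, True),
    ]
    for clone in sigex_screen(library):
        print(clone.name)
```

Only the substrate-inducible clone survives both rounds, which is the class of insert that the procedure is designed to recover.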

Bioinformatics systems are being developed to handle the vast amount 
of data derived from metagenomic projects. For example, an Internet 
resource that has been called CAMERA, which stands for Community 
Cyberinfrastructure for Advanced Marine Microbial Ecology and Analysis 
(http://camera.calit2.net), provides extensive sequence and ecological databases,
as well as the computational tools for analyzing marine metagenomic


FIGURE 5.3 SIGEX system for isolating catabolic genes from metagenomic libraries. 
(A) A SIGEX vector, designated p18GFP, contains the gfp gene encoding green fluorescent
protein (GFP) under the control of the IPTG-inducible promoter p_lac (top
plasmid). Metagenomic DNA fragments are inserted into the multiple cloning site
(MCS) between p_lac and gfp. If an inserted fragment contains a constitutive promoter
that is active in the host cell and drives expression of gfp or a sequence that does not 
disrupt the activity of p_lac (e.g., a sequence lacking a transcriptional terminator), 
then the cells carrying these constructs will emit green fluorescence in the presence 
of the inducer, IPTG. If the insert carries a promoter and regulatory elements that 
are activated only in the presence of a substrate of interest, e.g., benzoate, and drive 
gfp expression, then these cells will emit green fluorescence when the substrate is 
added. ND, not determined because cells that express gfp in the presence of IPTG 
are removed in the first SIGEX screening step. (B) Metagenomic libraries are prepared
in p18GFP and initially screened using FACS to remove clones that emit
green fluorescence in the absence of a catabolic substrate. Nonfluorescent cells are 
collected, grown in the presence of a substrate of interest, and then screened again 
by FACS to identify clones that carry substrate-inducible regulatory elements. The 
inserts from clones that emit green fluorescence only in the presence of the substrate 
are analyzed to identify full or partial sequences encoding catabolic enzymes. 
Additional experiments may be required to isolate entire catabolic operons. 

MILESTONE 


Multiplexed Biochemical Assays with Biological Chips 

S. P. Fodor, R. P. Rava, X. C. Huang, A. C. Pease, C. P. Holmes, 
and C. L. Adams 
Nature 364:555-556, 1993 

Quantitative Monitoring of Gene Expression Patterns 
with a Complementary DNA Microarray 

M. Schena, D. Shalon, R. W. Davis, and P. O. Brown 
Science 270:467-470, 1995 


Fodor et al. described the basic 
features of an oligonucleotide 
microarray fabricated by photo¬ 
lithography. In the words of the 
authors, "Recently, we have devel¬ 
oped new technologies to synthesize 
and assay biological molecules in a 
miniaturized format. The method uses 
light to direct the combinatorial chem¬ 
ical synthesis of biopolymers on a 
solid support. The identity and location
of each biopolymer is known, and
its interaction with a molecular 
binding agent can be measured. These 
miniature biological arrays, or chips, 
can then be used for a variety of mul¬ 
tiplexed biochemical assays...." In a 
later article (S. P. A. Fodor, Science 
277:393-395,1997), Fodor noted that 
"the applications [of oligonucleotide 
microarrays] appear to be only limited 
by imagination." Over a short period,
DNA microarrays have become a standard
component in the tool kits of
both molecular biologists and biotech¬ 
nologists. 

Schena et al. showed the utility of a 
robotic printed (spotted) cDNA 
microarray for analyzing gene expres¬ 
sion on a large scale. By current stan¬ 
dards, a 45-cDNA microarray seems 
modest. However, in 1995, it was a 
spectacular demonstration of a new 
technology. 

In sum, although not necessarily 
the first articles using or describing 
multiple assay formats, the Fodor et 
al. and Schena et al. publications 
showed that DNA microarray plat¬ 
forms had come of age. 


libraries. Researchers will also be able to correlate species abundance and 
gene frequencies with environmental and physicochemical information for 
a better appreciation of the dynamics of marine microbial communities. In 
a broader context, algorithms have been developed to detect prokaryotic 
gene sequences, to recognize DNA sequences that are specific for particular 
microbial species, and to distinguish members of known gene families. In 
sum, the metagenomics approach has begun to reveal an immense amount 
of information about the vast microbial world that barely a decade ago was 
considered beyond reach. 


Functional Genomics 

Genomics encompasses the study of all features of genomes and individual 
genes at the DNA level, including mutations, polymorphisms, and phylo¬ 
genetic relationships that are based on sequence differences. Another 
aspect of genomics that is often called functional genomics (or transcriptomics)
is concerned with the patterns of transcription, either qualitatively
to determine which genes are expressed or quantitatively to measure 
changes in the levels of transcription of genes. Transcription at the whole- 
genome level is assessed as a function of clinical conditions, as a conse¬ 
quence of mutations, in response to natural or toxic agents, in different cells 
or tissues, or at different times during biological processes, such as cell 
division or the development of an organism. One of the aims of gene 
expression studies is to discover the genes that are up- and downregulated 
under specific conditions (the transcriptome). In the past, the transcription 
of only one or a few genes could be followed at a time. Currently, func¬ 
tional genomics methodology can track the simultaneous transcription of 
thousands of genes (gene expression profiling) of either a cell or a tissue 
sample. The main experimental approaches for determining gene expression
profiles are DNA microarrays and serial analysis of gene expression
(SAGE). Because of the large amount of data that is generated from these 
experiments, special computational tools are required for obtaining, 
storing, and analyzing the results. 


DNA Microarray Technology 

A DNA microarray (DNA chip, or gene chip) experiment consists of hybrid¬ 
izing a nucleic acid sample (target) derived from the messenger RNAs 
(mRNAs) of a cell or tissue to single-stranded DNA sequences (probes) that 
are bound in an ordered arrangement to a solid platform. One type of DNA 
microarray is constructed by spotting polymerase chain reaction (PCR)- 
amplified cDNA sequences from the mRNAs of a single cell or all or specific 
sets of the coding sequences of an organism onto a glass slide or nylon 
membrane. In this case, about 10,000 different probes can be arrayed in a 
1-cm² area. 

An alternative microarray system utilizes sets of oligonucleotides as 
probes, usually representing thousands of genes. For one commonly used 
platform, the probes are synthesized directly (in situ) on a solid surface 
(quartz wafer) by a light-directed process known as photolithography. 
Thousands of copies of an oligonucleotide with the same specific nucle¬ 
otide sequence are synthesized in a predefined position (probe cell or fea¬ 
ture) on the array surface. For this type of microarray, the probes are 
typically 10 to 40 nucleotides, and several probes with different sequences 
for each gene will be synthesized on the microarray. Longer oligonucle¬ 
otides up to 100 nucleotides can also be used. A complete whole-genome 
oligonucleotide array may contain more than 500,000 probes representing 
as many as 30,000 genes. 

Generally, the design of the probes (probe set) for a microarray depends 
on the objective of the experiment and the degree of resolution that is 
required. Computer programs determine probe sequences that are specific 
for their target sequences, are least likely to hybridize with nontarget 
sequences (cross-hybridize), have no secondary structure (foldback) that 
would prevent hybridization with the target sequence, and have similar 
melting (annealing) temperatures, so that all target sequences can bind to 
their complementary probe sequences under the same conditions. 
Oligonucleotide microarrays may consist of probes that represent an entire 
genome, a single chromosome, selected genomic regions, or selected 
coding regions from one or several different organisms. Repetitive sequences 
are not included in genomic DNA microarrays. 
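
Two of these design criteria, similar melting temperatures and the absence of foldback structure, can be approximated with very simple checks. The sketch below is illustrative only: it uses the Wallace rule (2 °C per A or T, 4 °C per G or C), which is a rough estimate intended for short oligonucleotides, and a naive search for internal reverse-complementary stretches. Real probe design software uses more detailed thermodynamic models and also tests for cross-hybridization, which is omitted here. The function names, stem length, and tolerance are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(seq: str) -> float:
    """Rough melting temperature (deg C) by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def has_foldback(seq: str, stem: int = 4) -> bool:
    """True if a `stem`-long window can pair with a later, non-overlapping
    stretch of the same probe (a crude hairpin check)."""
    seq = seq.upper()
    for i in range(len(seq) - stem + 1):
        partner = reverse_complement(seq[i:i + stem])
        if seq.find(partner, i + stem) != -1:
            return True
    return False


def select_probes(candidates, tm_target: float,
                  tm_tolerance: float = 2.0, stem: int = 4):
    """Keep candidates near the target Tm that show no foldback."""
    return [p for p in candidates
            if abs(wallace_tm(p) - tm_target) <= tm_tolerance
            and not has_foldback(p, stem)]


if __name__ == "__main__":
    candidates = ["AACCAACCAA", "GGGGAAAACCCC", "ATGCATGCAA"]
    print(select_probes(candidates, tm_target=28.0))
```

Filtering to a narrow melting temperature window is what allows every probe on the array to hybridize under a single set of conditions.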

Typically, for most gene expression profiling experiments that utilize 
microarrays, mRNA is extracted from cells or tissues and purified, and 
cDNA is synthesized using reverse transcriptase and the extracted mRNA 
as a template. Usually, mRNA is extracted from two or more sources whose 
expression profiles are compared; for example, from diseased versus 
normal tissue or from cells grown under different conditions. The cDNA 
from each source is labeled with a different fluorophore by incorporating 
fluorescently labeled nucleotides during cDNA synthesis. For example, a 
green-emitting fluorescent dye (Cy3) is used for the normal (reference) 
sample and a red-emitting fluorescent dye (Cy5) for the test sample (Fig. 
5.4). After being labeled, the cDNA samples are mixed and hybridized to 


FIGURE 5.4 Gene expression profiling with a DNA microarray. mRNA is
extracted from two samples (sample 1 and sample 2), and during reverse
transcription, the first cDNA strands are labeled with the fluorescent dyes
Cy3 and Cy5, respectively. The cDNA samples are mixed and hybridized to an
ordered array of either gene sequences or gene-specific oligonucleotides.
After the hybridization reaction, each probe cell is scanned for both
fluorescent dyes and the separate emissions are recorded. Probe cells that
produce only a green or red emission represent genes that are transcribed
only in samples 1 and 2, respectively; yellow emissions denote genes that
are active in both samples; and no emissions (black) represent genes that
are not transcribed in either sample.

FIGURE 5.5 Fluorescence image of a DNA microarray hybridized with Cy3- and
Cy5-labeled cDNA. Reproduced from
http://biotech.biology.arizona.edu/Resources/DNA_analysis.html. Courtesy of
N. Anderson, University of Arizona.


the same microarray. A laser scanner determines the intensities of Cy5 and 
Cy3 for each probe cell. The probe cells have different colors depending on 
the amounts of Cy3 and Cy5 that are present, and the ratio of red (Cy5) to 
green (Cy3) fluorescence intensity of a probe cell indicates the relative 
expression levels of the represented gene in the two samples (Fig. 5.5). To 
avoid variation due to inherent and sequence-specific differences in 
labeling efficiencies between Cy3 and Cy5, reference and test samples are 
often reverse labeled and hybridized to another microarray. In the above
example, reverse labeling (dye swapping) would entail labeling the reference
sample with Cy5 and the test sample with Cy3. Alternatively, for some
microarray platforms, the target sequences from each source are labeled
with the same fluorescent dye, and reference and test samples are
hybridized to different microarrays.
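
The effect of dye swapping can be illustrated with a short calculation. The
sketch below is not a procedure given in the text; it assumes that the dye
bias adds the same amount to the log2(Cy5/Cy3) value of a probe cell on both
arrays, in which case half the difference between the forward and the
dye-swapped measurement recovers the test-to-reference ratio. The function
name and the numbers are hypothetical.

```python
def dye_swap_log_ratio(forward, swapped):
    """Combine log2(Cy5/Cy3) values from a forward and a dye-swapped array.

    forward: log2(Cy5/Cy3) with the test sample labeled with Cy5
    swapped: log2(Cy5/Cy3) with the test sample labeled with Cy3
    A dye bias that adds equally to both values cancels in the difference.
    """
    return (forward - swapped) / 2.0


# Hypothetical probe cell: true log2(test/reference) = 2.0, dye bias = +0.3.
forward = 2.0 + 0.3    # test labeled with Cy5
swapped = -2.0 + 0.3   # test labeled with Cy3, so the true ratio is inverted
corrected = dye_swap_log_ratio(forward, swapped)
print(round(corrected, 6))
```

If the bias is not the same on both arrays, the cancellation is only partial,
which is one reason replicate hybridizations are still needed.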

In an alternative strategy, mRNA is purified with an oligo(dT) sequence
that binds to the poly(A) tail of eukaryotic mRNA and has a short extension
(T7 primer) containing the sequence for the bacteriophage T7 RNA polymerase
promoter (Fig. 5.6). The oligo(dT)-T7 sequence primes the synthesis
of cDNA from mRNA using reverse transcriptase. Synthesis of the second
DNA strand using DNA polymerase results in double-stranded cDNA that
contains the T7 RNA polymerase promoter. Next, T7 RNA polymerase is
used to produce RNA copies (complementary RNA [cRNA] or antisense
RNA) from the second cDNA strand as a template in the presence of
biotin-labeled ribonucleotides. This reaction results in linear amplification
and biotinylation of cRNA, which is then fragmented into pieces from 50 to
100 nucleotides in length that are optimal for hybridization. After
hybridization, the microarray is treated with streptavidin that is bound to
the fluorescent protein phycoerythrin. Streptavidin binds specifically to the
biotin residues of hybridized cRNA, and hybridization can be detected by
emissions from phycoerythrin that are elicited during laser scanning.

FIGURE 5.6 Gene expression profiling with an oligonucleotide microarray.
mRNA is purified with a poly(dT) sequence that has a T7 RNA polymerase
primer sequence extension. After two-stranded cDNA synthesis, the second
cDNA strand acts as a template for synthesis of cRNA by T7 RNA polymerase in
the presence of ATP, cytidine triphosphate (CTP), guanosine triphosphate
(GTP), uridine triphosphate (UTP), biotinylated CTP, and biotinylated UTP.
The gray circles represent incorporated biotinylated nucleotides. The
biotinylated cRNA is purified, fragmented into pieces from 50 to 100
nucleotides in length, and hybridized to an oligonucleotide microarray. The
microarray is treated with streptavidin-phycoerythrin, and the probe cells
(black squares) are scanned for emission (yellow) from the biotin-bound
streptavidin-phycoerythrin.

Because of the vast amount of data generated by microarray experiments,
specialized software has been developed to maximize the output of
information. Analyses of two-dye and one-dye hybridized microarrays are
similar. A common method by which information is extracted from two-dye
hybridized microarrays is summarized here. Each probe cell of a two-dye
hybridized microarray is scanned using a confocal scanning
microscope. Following laser excitation of the dye, fluorescence emitted
from each probe cell, detected at both 532 and 635 nm for Cy3 and Cy5,
respectively, is collected through the microscope's objective lens and
converted to an electrical signal via a photomultiplier tube. The intensities
of fluorescence emitted by both dyes for each probe cell, along with
background readings for the microarray, are recorded and stored. Background
fluorescence is determined by measuring the fluorescence from blank
areas where probe cells have not been spotted and is subtracted from the
fluorescence intensities measured for each probe cell. Microarrays are
designed with internal controls, that is, specific probes that are used to
evaluate the reliability of the hybridization procedure and to ensure that
the laser scanner performed properly. At this stage, the microarray data
(i.e., the collection of fluorescence intensities of each probe cell) is
normalized to correct for variations (systematic errors) caused by technical
factors that contribute to the fluorescence intensities of a probe cell and
enable comparison among the microarrays of an experiment. To minimize errors,
multiple probe cells for each gene are included on a single microarray, and
replicate samples are independently prepared under the same conditions
and hybridized to different microarrays. Misleading interpretations are
caused, in part, by variations in sample preparation, such as growth
conditions, RNA extraction, cDNA synthesis and labeling efficiencies,
differences in efficiencies of hybridization of the target sequences among
replicate microarrays or across a single microarray, variations in the
concentrations of probes on different microarrays, or unequal amounts of
target sequences applied to different microarrays or unequal distribution
of targets on a single microarray. Several methods for normalization are
used to calibrate the data among replicate microarrays, such as using the
fluorescence intensity of a gene that is not differentially expressed under
different conditions as a reference point, including spiked control sequences
that are sufficiently different from the target sequences and therefore bind
only to a corresponding control probe cell, and adjusting the total
fluorescence intensity for each microarray to a similar value under the
assumption that a relatively small number of genes are expected to change
under different conditions.
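
Background subtraction and the last normalization method mentioned above
(adjusting the total fluorescence intensity of each microarray to a similar
value) can be sketched in a few lines. This is a minimal illustration, not
the algorithm of any particular software package; the function names, the
intensity values, the choice to floor negative values at zero, and the use
of the mean total as the common target are all assumptions.

```python
def subtract_background(intensities, background):
    """Subtract the blank-area background from each probe cell (floor at 0)."""
    return [max(value - background, 0.0) for value in intensities]


def scale_to_total(intensities, target_total):
    """Rescale one array so that its summed intensity equals target_total."""
    total = sum(intensities)
    if total == 0:
        raise ValueError("array has no signal to normalize")
    factor = target_total / total
    return [value * factor for value in intensities]


# Hypothetical raw readings for the same four probe cells on two replicates.
array_a = subtract_background([1200.0, 5400.0, 300.0, 8100.0], background=100.0)
array_b = subtract_background([2500.0, 11000.0, 500.0, 16400.0], background=200.0)

# Assumed target: the mean of the two background-corrected totals.
target = (sum(array_a) + sum(array_b)) / 2
norm_a = scale_to_total(array_a, target)
norm_b = scale_to_total(array_b, target)
print(round(sum(norm_a), 3), round(sum(norm_b), 3))
```

Scaling by the total is only reasonable under the stated assumption that
most genes do not change between conditions; otherwise a reference gene or
spiked controls would be the safer calibration point.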

One of the major purposes of a microarray experiment is to identify
genes whose expression changes in response to a particular biological
condition. The response to a biological condition is determined by comparing
the fluorescence intensity for each gene (each probe cell), averaged among
replicates, under two different conditions and calculating the ratio,
commonly expressed as an n-fold change. For effective comparisons, the raw
data of the dye emissions of each probe cell of a microarray are often
converted to log2 ratios (Table 5.4). The sign indicates the dye with the
higher intensity. Generally, positive log ratios represent more Cy5 than Cy3
and, therefore, greater expression of the gene in the test sample than in the
reference sample. Negative values (more Cy3 than Cy5) indicate a lower level
of expression in the test sample than in the reference sample. The log ratios
for all probe cells are compiled into a table called an expression matrix.
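
The conversion summarized in Table 5.4 is a one-line calculation per probe
cell. The sketch below recomputes the table's three features; the function
name is hypothetical, and the intensities are those listed in the table.

```python
import math


def log2_ratio(cy3, cy5):
    """Return log2(Cy5/Cy3) for one probe cell; intensities must be positive."""
    if cy3 <= 0 or cy5 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


# (feature, Cy3 intensity, Cy5 intensity), taken from Table 5.4.
features = [(1, 180, 10_000), (2, 5_400, 5_400), (3, 8_400, 400)]

for feature, cy3, cy5 in features:
    print(feature, round(cy5 / cy3, 2), round(log2_ratio(cy3, cy5), 2))
```

For feature 1 the unrounded ratio (about 55.56) gives a log2 value of about
5.80; the +5.81 in the table corresponds to taking the logarithm of the
rounded ratio, 56. Equal intensities give 0, and the sign flips when Cy3
exceeds Cy5, as described above.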

Microarray analysis is also used to identify genes that are coexpressed 
under different conditions or over a period of time, with the goal of
determining which gene products function in a given pathway. A number of
computational strategies are available that organize the data into related 
groups (clusters). For a clear presentation of the clustered data, ranges of 
log ratio values are assigned arbitrary colors. Usually, black is designated 
for a log ratio of zero, dark to bright red for increasing positive log ratios, 
and dark to bright green for decreasing negative log ratios. In other words, 
red is frequently used to denote gene overexpression and green to denote 
underexpression. A visualized representation of a clustered microarray is 
called a gene expression profile, where the rows represent the reordered 
genes and the columns represent either conditions or samples (Fig. 5.7). 
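
The color convention described above can be written as a small mapping from
a log ratio to a red-green-blue value. This is a sketch of the convention
only, not of any clustering program; the function name and the saturation
threshold (the absolute log ratio at which the color reaches full
brightness) are assumptions.

```python
def log_ratio_to_rgb(log_ratio, saturation=3.0):
    """Map a log2 ratio to (R, G, B): black at 0, red above, green below."""
    # Clip so that |log_ratio| >= saturation gives the brightest color.
    fraction = min(abs(log_ratio), saturation) / saturation
    level = int(255 * fraction)
    if log_ratio > 0:
        return (level, 0, 0)   # overexpressed in the test sample
    if log_ratio < 0:
        return (0, level, 0)   # underexpressed in the test sample
    return (0, 0, 0)           # no change


# One row of a hypothetical expression matrix, rendered as colors.
row = [-4.39, -1.0, 0.0, 1.0, 5.81]
print([log_ratio_to_rgb(value) for value in row])
```

Applying this mapping to every cell of the clustered expression matrix
yields the kind of image shown in a gene expression profile.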

TABLE 5.4 Converting Cy3 and Cy5 intensities to log2 ratios

Feature (gene)   Cy3 intensity   Cy5 intensity   Cy5/Cy3   log2(Cy5/Cy3)
1                180             10,000          56        +5.81
2                5,400           5,400           1         0
3                8,400           400             0.05      -4.39

The gene expression profile in Fig. 5.7, determined by microarray
analysis, clearly shows that different genes are transcribed in patients with
cirrhosis of the liver than in healthy individuals and in patients with
ethanol-induced cirrhosis than in those with cirrhosis induced by the
hepatitis C virus. Moreover, there is a difference between the genes that are
turned on during advanced ethanol-induced liver damage and those in patients
with less severe ethanol-induced cirrhosis (Fig. 5.7). No such distinction is
evident among individuals with different severities of virus-induced
cirrhosis (Fig. 5.7). In addition, information about the transcription of
genes that contribute to a particular pathway or cellular activity can be
extracted from a gene expression profile. For example, genes that are
transcribed during lymphocyte proliferation and activation are highly
expressed in virus-induced liver cirrhosis and to a much lesser extent in
ethanol-associated cirrhotic samples (Fig. 5.8).

FIGURE 5.7 Gene expression profile of cirrhotic liver tissue. Columns 1 to 7
and 8 to 15 are liver samples from patients with ethanol- and hepatitis C
virus-induced cirrhosis of the liver, respectively. Each patient's sample was
compared to normal liver tissue. The cluster consists of 2,965 genes. The
asterisks denote patients with severe cirrhosis of the liver. Adapted from
Lederer et al., Virol. J. 3:98, 2006.

FIGURE 5.8 Gene expression profile of lymphocyte-specific genes from
cirrhotic liver tissue. Columns 1 to 7 and 8 to 15 are liver samples from the
patients shown in Fig. 5.7 with ethanol- and hepatitis C virus-induced
cirrhosis of the liver, respectively. Each patient's sample was compared to
normal liver tissue. The cluster consists of about 70 genes. The asterisks
denote patients with severe cirrhosis of the liver. Adapted from Lederer et
al., Virol. J. 3:98, 2006.

The importance and pervasiveness of DNA microarrays cannot be overstated.
In 2007, for example, there were more than 13,000 published journal
articles that either used this technology or described methods for enhancing
data analysis. Clinical applications for DNA microarrays are being
developed. The U.S. Food and Drug Administration granted permission in 2007
for the first commercial diagnostic assay based on a DNA microarray. In this
case, a 70-gene expression profile (MammaPrint) distinguishes between
patients with breast cancer that is likely to migrate to other sites
(metastasize) and those whose cancer has a low risk of metastasis. The
reliability and reproducibility of microarrays with different formats and
from various laboratories have been major concerns. However, standards for
both running microarray experiments and analyzing the data have been proposed
by a number of international groups, which should alleviate these problems.
Finally, it should be noted that in addition to gene expression studies, DNA
microarrays are used to determine the binding sites for DNA-binding proteins
(e.g., ChIP-on-chip assays, in which chromatin immunoprecipitation [ChIP] is
used to isolate the DNA fragments bound by a protein and the fragments are
then identified by hybridization to a DNA microarray), the sites where the
transcription of genes starts and stops, and many other aspects of genome
architecture. This research has shown that a much larger proportion of the
eukaryotic genome is transcribed than was previously thought; a number of
genes have multiple start and termination sites, some of which lie hundreds
of kilobases from the previously known sites; both strands of many
genomic regions are transcribed; splicing occurs between RNA molecules;
and some transcription factors bind to dozens of sites scattered throughout
the genome. In short, whole-genome microarray analysis has revealed
greater complexity in the processes that control transcription in a eukaryotic
organism than could be predicted through smaller-scale transcriptional
analyses.

Serial Analysis of Gene Expression 

Unlike DNA microarrays that rely on hybridization and signal detection,
SAGE uses recombinant DNA techniques to clone randomly linked short
sequences of cDNA prepared from extracted cellular mRNA that can be
efficiently sequenced to identify expressed genes (Fig. 5.9). Polyadenylated
mRNA is captured by an oligo(dT) sequence that is labeled with biotin and
attached to a streptavidin-coated magnetic bead. Double-stranded cDNA
is synthesized from the purified mRNA using reverse transcriptase to
synthesize the first strand of cDNA from the oligo(dT) primer and mRNA
template and then DNA polymerase to synthesize the second, complementary,
strand. A strong magnet is used to retain the magnetic beads with
attached cDNAs during successive treatments and washings. The cDNAs
are cut with the restriction endonuclease NlaIII, which recognizes the
sequence CATG and cuts outside the G:C base pairs, leaving a 4-nucleotide 3'
extension (3'-GTAC-5', i.e., CATG read 5' to 3'). Since NlaIII cuts, on
average, once every 256 base pairs (bp), there is a high probability that
each cDNA will have at least one NlaIII recognition site. Because the cDNAs
are bound to beads, the fragment extending from the NlaIII cut site that is
closest to the 3' end of each cDNA is retained, and all unbound fragments are
washed away. In the nomenclature developed for SAGE, NlaIII is called an
anchoring enzyme. Next, the NlaIII-digested cDNA sample is divided in
two. One aliquot is ligated with adaptor A and the other with adaptor B.
Both adaptors have a CATG extension that is complementary to the extension
produced by NlaIII digestion, a 5-bp recognition site for the restriction
endonuclease BsmFI, and a sequence for priming a PCR. The adaptors
have different primer sequences to prevent intrastrand base pairing (snap
back) during subsequent PCR steps. After the adaptors are ligated to
NlaIII-digested cDNA, the products are treated with BsmFI. Unlike NlaIII
and other type II restriction endonucleases, this type IIs restriction
endonuclease cuts 10 nucleotides downstream from its recognition site in one
DNA strand and 14 nucleotides in the other strand regardless of the
intervening nucleotide sequence. In SAGE, BsmFI is known as a tagging
enzyme, and the segment of cDNA produced by the BsmFI treatment is
called a tag. BsmFI digestion releases the adaptor-tag molecules from the
beads into solution, from which they can be recovered. The 4-nucleotide
extension of the BsmFI-cut DNA is filled in to form a blunt-end molecule,
the pools of adaptor A- and adaptor B-tags are mixed, and T4 DNA ligase
is added to the mixture. Under these conditions, the blunt ends of two tags
are joined to form a two-tag molecule (a ditag) that is flanked by primer
sequences. Since ditag formation is completely random, tags from different
cDNAs are joined during ligation, and the ditags are readily amplified
during PCR using primer sequences present in the adaptors. The
amplified ditags are treated with NlaIII to release the adaptor sequences
and produce ditags with an NlaIII extension at each end. The NlaIII-digested
ditags are ligated to form multiple, randomly joined combinations
(concatemers) of ditags. Concatemers that are about 500 bp in length
are isolated and cloned in an E. coli plasmid. The concatemers are
sequenced, the sequence of each tag is recorded, and a specialized "tag to
gene" database is searched to identify the corresponding gene. The
sequenced tag is derived from the 3' end of the mRNA and therefore
corresponds to the 3' end of the gene. The number of times each tag is
sequenced, which represents its abundance in the initial sample, is
determined. Up- and downregulated mRNAs can be identified by comparing
the frequencies of tags in different samples. Generally, more than 10,000
unique tags are collected from a single experiment. Over 30 million SAGE
tags have been assembled from humans and another 35 million from various
organisms. In principle, SAGE can detect all the transcripts in a
sample, whereas with a DNA microarray, only the sequences that correspond
to probes on the array are identified.
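
The logic of tag generation and counting can be mimicked on sequence data.
The sketch below is an in silico illustration only, not the laboratory
protocol or the method of any SAGE database: it takes the sense-strand
sequence of each cDNA, finds the CATG site closest to the 3' end (the
anchoring site), reads a fixed number of nucleotides downstream of it as the
tag, and tallies the tags. The function names, the 10-nucleotide tag length,
and the example sequences are assumptions.

```python
from collections import Counter

ANCHOR_SITE = "CATG"   # NlaIII recognition sequence
TAG_LENGTH = 10        # assumed number of nucleotides kept as the tag


def extract_tag(cdna):
    """Return the tag next to the 3'-most CATG of a cDNA, or None."""
    position = cdna.rfind(ANCHOR_SITE)
    if position == -1:
        return None  # no anchoring site: this transcript yields no tag
    start = position + len(ANCHOR_SITE)
    tag = cdna[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


def count_tags(cdnas):
    """Tally how often each tag occurs in a collection of cDNA sequences."""
    tags = (extract_tag(sequence) for sequence in cdnas)
    return Counter(tag for tag in tags if tag is not None)


# Hypothetical cDNA sense strands, written 5' to 3'.
sample = [
    "TTCATGACGTACGTAGCTAAAAAA",
    "TTCATGACGTACGTAGCTAAAAAA",
    "CATGAAAAAAAAAACATGCCCCCCCCCCTT",   # two sites: only the 3'-most is used
    "AAAAGGGGTTTT",                     # no CATG site
]
print(count_tags(sample))

# A specific 4-bp site is expected once per 4**4 positions in random sequence.
print(4 ** 4)
```

Comparing the resulting counts between two samples corresponds to the
comparison of tag frequencies described above; a transcript without a CATG
site contributes no tag.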

Additional SAGE protocols, such as LongSAGE and SuperSAGE, have
been developed to produce longer tags (Table 5.5). Also, other anchoring
enzymes, e.g., Sau3AI and other restriction endonucleases that recognize
and cleave at specific 4-bp sequences (4-base cutters), have been used to
identify transcribed genes that do not have a convenient NlaIII site. An
online resource called SAGE Genie (http://cgap.nci.nih.gov/SAGE) is
available for matching tags to likely genes, determining the frequency of a
tag among various SAGE libraries, and providing other pertinent information.

FIGURE 5.9 SAGE. Poly(A)+ mRNA transcripts are isolated by poly(dT)
hybridization. The poly(dT) sequence is biotinylated (B) and bound to
streptavidin (SA)-coated magnetic beads (yellow circles), or the poly(dT) is
attached directly to magnetic beads. Double-stranded cDNA is synthesized from
the captured mRNAs and then cut with the restriction endonuclease NlaIII. The
fragments of cDNA that are not bound to the magnetic beads are eluted. The
NlaIII-cut cDNA sample that is attached to the magnetic beads is divided in
two, and one sample is ligated with adaptor A and the other with adaptor B.
Each adaptor contains a 4-base extension that is complementary to the 4-base
extension produced by cleavage with NlaIII, a recognition site for the
restriction endonuclease BsmFI, and its specific primer sequence (primer A
and primer B). The ligated "adaptor-NlaIII-cut" molecules are treated with
BsmFI, which cleaves the DNA 10 and 14 nucleotides downstream from its
recognition site (open arrows). The extensions of the BsmFI fragments are
filled in by DNA synthesis, and the mixture is blunt-end ligated. Some of the
ligated molecules consist of two joined segments from different cDNAs
(ditags) that are flanked by sequences for primers A and B. The ditags are
amplified by PCR and then treated with NlaIII to remove the primer sequences
and generate sticky ends. Ligation of NlaIII-cut ditags forms concatemers of
various ditags. Concatemers with about 20 ditags (~500 bp) are purified and
cloned into a plasmid vector. The concatemers are sequenced, and the
individual tags are identified. The likely corresponding gene is determined
by a similarity search, and the frequency of each tag in the sample is
recorded.


TABLE 5.5 Different SAGE systems

System      Tagging enzyme   Recognition site   Tag (bp)
SAGE        BsmFI            GGGAC(N)10         10-14
                             CCCTG(N)14
LongSAGE    MmeI             TCCRAC(N)21        17-21
                             AGGYTG(N)19
SuperSAGE   EcoP15I          CAGCAG(N)25        26
                             GTCGTC(N)27

Proteomics

Proteomics is the comprehensive study of all the proteins, i.e., the
proteome, of a cell, tissue, body fluid, or organism from a variety of
perspectives, including structure, function, expression profiling, and
protein-protein interactions. There are a number of good reasons to study the
protein complement of cells or tissues. First, proteins comprise the active
component of cells. They are the molecular machines that catalyze the
synthesis of important metabolites and molecules, monitor the internal and
external environment of the cell and mediate responses to environmental
perturbations, and make up the structural components of cells. Thus, insight
into the proteins that are present in a cell or tissue under particular
biological conditions can aid in our understanding of the cell's activities.
Although protein-coding sequences can often be identified in genomic
sequences, some annotated open reading frames (ORFs) are subsequently found
not to encode proteins, and others encode proteins whose functions cannot be
predicted from the sequence. Furthermore, posttranslational modifications
that influence the function and cellular localization of proteins
often cannot be predicted from the sequence. On the other hand, a protein's
function can sometimes be inferred by determining the conditions under 
which it is expressed and active. Although expression profiles of protein- 


BOX 5.1 


How Many Genes Do We 
Have? 

The exact number of genes in the human genome has been difficult to pin
down. Before the release of the draft sequences of the human genome in
February 2001, it was thought that we have between 80,000 and 100,000
genes. Surprisingly, estimates based on analyses of the draft sequences
ranged from 23,000 to 49,000, with a consensus value of about 30,000 genes.
By comparison, yeast (S. cerevisiae), fruit flies (D. melanogaster), and
roundworms (C. elegans) have about 6,000, 13,600, and 20,000 genes,
respectively. Considering our apparent biological complexity, it was
perplexing that humans have only 50% more genes than a simple invertebrate
and about as many as a tomato. To confound the gene number controversy, some
researchers estimated that there were as many as 60,000, 89,000, or, quite
possibly, 120,000 human genes.

Clearly, a number of questions come to mind. How are gene numbers
predicted? Why do gene counts vary? What is the likely number of human
genes? What is the significance of a low gene number?

One of the obstacles to an accurate human gene count is that
protein-coding genes make up only 3 to 5% of the total genome. Although it is
a belabored cliche, gene counters are looking for a handful of needles
hidden in a large haystack. There must be a precise identification process
for any realistic tally of the total number of human genes. Unfortunately,
eukaryote gene structure confounds gene enumeration. Exons are usually about
140 nucleotides and are separated by introns that may be thousands of
nucleotides in length. Also, the sequences at the 5' and 3' ends (5' and 3'
untranslated regions) of mRNAs are not translated and vary from gene to gene.
However, there are some consistent features that aid in the identification
of protein-coding regions (ORFs) and that can be readily recognized. The
translation start codon ATG distinguishes the first exon at the beginning of
a protein-coding gene and is preceded by a transcription start site. At the
3' end of most genes, there are a translation stop codon, transcription
termination site, and sequence that signals poly(A) tailing. Finally, the
splice sites that precede and follow the internal exons are regular motifs.

A number of gene prediction computer programs based on the common
gene structures have been devised to scan the genome for possible genes.
Generally, these programs are designated ab initio, which means "from
the beginning" or, in the context of gene finding, an approach "based on
first principles." The ab initio programs predict coding nucleotides with
high accuracy and recognize exons with good efficiency but are less than
50% effective at finding complete genes. In addition, the numbers of
falsely identified genes (false positives) are high.

Alignments between a comprehensive set of cDNAs and the human
genome are another way of identifying genes. Full-length cDNAs,
which are derived from extracted cellular mRNA, are an excellent resource

(continued)

coding sequences can be determined using transcriptomics, mRNA levels
do not always correlate with protein levels, and interactions between
proteins cannot be assessed by these methods. From a practical standpoint,
proteomics can be used to track clinical disorders and detect targets for
therapeutic treatments.

A number of factors complicate the study of proteins. First, in
eukaryotes, there are many more proteins than genes due to alternative
splicing, posttranslational modifications, and, to a lesser extent,
posttranscriptional modifications to RNA (RNA editing). With about 60 to 70%
of the estimated 30,000 human genes (Box 5.1) undergoing alternative
splicing, the human proteome may consist of 85,000 or more different
proteins. Second, it is impossible to account experimentally for every member
of a proteome with a single technique because proteins are susceptible to
degradation; have different properties, including different solubilities; and
range considerably in abundance. In spite of these drawbacks, effective
procedures have been devised for examining most of the components of many
proteomes.

Separation and Identification of Proteins 

Ordinarily, the study of the proteome of an entire multicellular organism is 
difficult due to the diverse functions of the many cells/tissues that comprise


BOX 5.1 (continued) 


for finding genes because they contain all the exons of a gene.
However, variation in the splicing of exons makes accurate gene
identification somewhat arduous. More importantly, relatively few complete
human cDNAs have been synthesized and sequenced, although techniques have
been developed for capturing intact capped mRNAs. The major sources of
cDNA sequences are EST databases. These cDNAs are incomplete and
concentrated at the 3' ends of the mRNAs. In addition, the databases contain
about 10 to 15 ESTs for each gene. In some studies, a single representative
sequence for each cDNA was determined to winnow down the overall
number of cDNA sequences that would be needed for scanning genome
sequences. The cDNA approach effectively locates exons. However, final
gene counts are determined by combining the information about cDNA
alignments with results from ab initio analyses. A number of genes are
overlooked with this approach, because not all genes are represented in the
cDNA databases. Also, miscounting splice variants as a number of individual
genes instead of a single gene can inflate a gene count.

Theoretically, the alignment of the complete genomic sequence of an
organism, such as the mouse or puffer fish, to the entire human genome
should locate highly conserved sequences that would likely be exons
and regulatory elements and produce poor similarity scores for introns and
the DNA between genes (intergenic DNA). The principle underlying this
comparative genome strategy for gene enumeration is that introns and
intergenic DNA are not under the same biological constraint as exons and
regulatory regions. Thus, nucleotide changes accumulate in introns and
intergenic DNA, with the result that over a long period of time, these
sequences in different species diverge from one another. By contrast, the
sequence similarity of exons and regulatory elements is maintained between
relatively closely related organisms because gene mutations lower
reproductive fitness, which makes these sequence deviations less likely to be
passed on from one generation to the next. Generally, on the basis of
different gene identification strategies, we probably have about 22,000 genes.

The ENCODE (Encyclopedia of DNA Elements) project is a large
collaborative effort that is dedicated to elaborating in fine detail the
molecular aspects of the human genome. Some important and exciting
discoveries have been made by the ENCODE pilot project, which covered about
1% of the human genome. For example, virtually all the DNA segments under
study were transcribed, many coding regions have multiple transcription
start sites, and, surprisingly, transcription often goes in both directions
within the same segment of DNA. More studies are required to get a
broader view of how the human genome functions. However, not only
does the ENCODE pilot study point to more human genes being discovered,
but the apparent complexity that has been observed strongly suggests that
it may be necessary to redefine the current concept of the gene at the
molecular level.

the organism. For convenience, the complexity is often reduced by
examining the protein complement of a cellular component or organelle, such
as the nucleolus, nuclear matrix, lysosome, or endoplasmic reticulum. These
protein subsets have been dubbed subproteomes. Because of its high
resolving power, two-dimensional polyacrylamide gel electrophoresis (2D
PAGE) is frequently used to separate a population of proteins in a sample
(Fig. 5.10). Briefly, the proteins in a sample are first separated on the
basis of their net charge by electrophoresis through an immobilized pH
gradient in one dimension (the first dimension). Amino acids in a polypeptide
have ionizable groups that contribute to the net charge of a protein; the
degree of ionization (protonation) is influenced by the pH of the solution.
In a gel to which an electric current is applied, proteins migrate through a
pH gradient until they reach a specific pH (the isoelectric point) where the
overall charge of the protein is zero and they no longer move (Fig. 5.10A).
However, some proteins migrate to the same position in the pH gradient
because they have the same isoelectric point, although they have different
molecular weights. These are further separated according to their molecular
weights by electrophoresis at right angles to the first dimension (the second
dimension) through a sodium dodecyl sulfate-polyacrylamide gel (Fig. 5.10B).
The separated proteins form an array of spots in the gel that is visualized
by staining the spots with Coomassie blue or silver protein stain. A 2D
polyacrylamide gel can resolve up to 2,000 different proteins. The pattern of
stained spots is captured by densitometric scanning. Databases have been
established with images of 2D polyacrylamide gels from different cell types.
Software packages are available for detecting spots, matching patterns
between gels, and quantifying the protein content of the spots. Proteins with
either low or high molecular weights, those that are found in cellular
membranes, and those that are present in small amounts are not readily
resolved by 2D PAGE. Also, highly charged proteins, such as ribosomal
proteins and histone proteins, are not separated by standard conditions. The
next task after separation of most of the proteins of a proteome or
subproteome is to excise the individual proteins from the gel, often using
robotics to extract large numbers of proteins, and to identify as many of the
proteins as possible. Mass spectrometry (MS) is commonly used for this
purpose.

FIGURE 5.10 2D PAGE for separation of proteins. (A) First dimension.
Isoelectric focusing is performed to first separate the proteins in a mixture
on the basis of their net charge. The protein mixture is applied to a pH
gradient gel. When an electric current is applied, proteins will migrate
toward either the anode (+) or cathode (-), depending on their net charge. As
proteins move through the pH gradient, they will gain or lose protons until
they reach a point in the gel where their net charge is zero. The pH in this
position of the gel is known as the isoelectric point and is characteristic
of a given protein. At that point, a protein no longer moves in the electric
current. (B) Second dimension. Several proteins in a sample may have the same
isoelectric point and therefore migrate to the same position in the gel in
the first dimension. Therefore, proteins are further separated on the basis
of differences in their molecular weights (MW) by electrophoresis, at a right
angle to the first dimension, through a sodium dodecyl
sulfate-polyacrylamide gel.

In principle, a mass spectrometer detects the mass of the ionized form 
of a molecule. For protein identification, proteins are fragmented into pep¬ 
tides that are ionized and separated according to their mass-to-charge (m/z) 
ratios, and then the abundances and m/z ratios of the ions are measured. 
The results are presented as a spectrum, with the x axis representing m/z 
ratios and the y axis representing the abundance of each ion relative to the 
most abundant ion. In practice, mass spectrometers have different configu¬ 
rations according to the nature (wet or dry) of the sample (analyte), the 
mode of ionization of the analyte, how the electric field(s) is established for 
accelerating the ions in order to separate and sort them, and the method for 
detecting the different masses. Mass spectrometric studies of proteins and 
peptides have been facilitated by effective ionization methods, such as 
electrospray ionization (ESI) and matrix-assisted laser desorption ioniza¬ 
tion (MALDI). Peptide masses are usually determined by MALDI-MS and 
amino acid sequences by ESI-tandem MS (ESI-MS-MS). MS is an impor¬ 
tant proteomic tool because analyses are rapid and accurate and require 
small amounts of starting material. Moreover, computational protocols are 
available for processing large amounts of MS data. 
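
The way a spectrum is scaled can be illustrated with a short Python sketch. It assumes the raw data are simply pairs of m/z value and detector intensity; the function name and the example peaks are hypothetical and are not taken from any instrument software.

```python
def normalize_spectrum(peaks):
    """Scale raw intensities so that the most abundant ion is set to 100."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        raise ValueError("spectrum has no positive intensities")
    # Sort by m/z (x axis) and express each abundance relative to the base peak (y axis).
    return [(mz, 100.0 * intensity / base) for mz, intensity in sorted(peaks)]


# Hypothetical raw peaks: (m/z, detector counts)
raw = [(942.43, 1200.0), (684.35, 300.0), (1171.50, 600.0)]
spectrum = normalize_spectrum(raw)
for mz, relative_abundance in spectrum:
    print(f"{mz:8.2f}  {relative_abundance:5.1f}")
```

In this toy spectrum the ion at m/z 942.43 is the most abundant and is assigned 100; the other two ions are reported as fractions of it.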

Protein identification is straightforward because particular databases 
can be easily searched with either peptide mass or amino acid sequence 
data. For whole-proteome analysis, the cellular proteins must first be sepa¬ 
rated, typically by excision of individual protein spots from a 2D polyacryl¬ 
amide gel following electrophoresis. Each excised protein is then treated 
with the proteolytic enzyme trypsin, which cleaves on the C-terminal side 
of lysine and arginine residues. Contaminating salts, polyacrylamide, and 
other compounds are removed before the peptides are concentrated. 
MALDI-MS is used regularly to determine the mass (m/z value) of each 
peptide fragment generated from the excised protein. Briefly, peptides are 
ionized by mixing them with a matrix consisting of an organic acid and 
then using a laser to promote ionization. The ions are accelerated through 
a tube using a high-voltage current, and the time required to reach the ion 
detector is determined by their molecular masses, with lower-mass ions 
reaching the detector first. The values of the observed peptide masses are 
matched with the expected masses of tryptic peptides for all known pro¬ 
teins (Fig. 5.11). This type of analysis is called peptide mass fingerprinting. 
Online sites (e.g., http://www.expasy.ch/tools/aldente/) are available 
that rapidly find the set of peptide masses of a known protein that most 
likely corresponds to those of the unknown protein. 
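
As a rough illustration of peptide mass fingerprinting, the following Python sketch digests protein sequences in silico with the cleavage rule described above and counts how many observed masses fall within a tolerance of the predicted tryptic peptide masses. The function names, the two toy sequences, and the 0.2-dalton tolerance are assumptions made for this example. The observed values are treated as neutral monoisotopic peptide masses, and the refinements used by real search tools (missed cleavages, modifications, and statistical scoring) are left out.

```python
# Monoisotopic masses (daltons) of amino acid residues, i.e., after loss of water.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N and C termini


def tryptic_digest(sequence):
    """Cut after every lysine (K) or arginine (R); a simplified trypsin rule."""
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed, sequence, tol=0.2):
    """Number of observed masses explained by the tryptic peptides of one protein."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed, database, tol=0.2):
    """Name of the database protein with the most matching peptide masses."""
    return max(database, key=lambda name: count_matches(observed, database[name], tol))


# Hypothetical two-protein database (names and sequences are invented).
database = {"protA": "MKVFDEFKGAR", "protB": "MSTNPKPQRLLGK"}
# Simulated measurements: two protA peptides with a small mass error added.
observed = [peptide_mass("VFDEFK") + 0.05, peptide_mass("GAR") - 0.03]
print(best_match(observed, database))  # prints protA
```

Counting matches is the simplest possible score; it is used here only to make the matching step concrete.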

FIGURE 5.11 Peptide mass fingerprinting. A spot containing an unknown protein that 
was separated by 2D PAGE is excised from the gel and treated with trypsin. 
Purified tryptic peptides are separated by MALDI-time of flight (TOF) MS. The set 
of peptide masses from the unknown protein is used to search a database that 
contains the masses of tryptic peptides for every known sequenced protein, and the 
best match is determined. The trypsin cleavage sites of known proteins are determined 
from the amino acid sequence, and consequently, the masses of the tryptic 
peptides are easy to calculate. Only some of the tryptic peptide masses for the 
unknown protein are listed in this example. 

Alternatively, the amino acid sequence of a peptide can be obtained by 
ESI-MS-MS and used to search a protein database to identify an unknown 
protein (Fig. 5.12). With this approach, the peptides derived from a protein 
spot in a 2D polyacrylamide gel are first separated by mass, and then one 
of the peptides is selected for sequencing. Fragmentation and ionization of 
a peptide occur by cleavage along the peptide backbone at amide bonds 
(peptide bonds) between amino acids. When the charge from ionization is 
retained on the N-terminal fragment, that fragment is designated a b ion, and when it is 
retained on the C-terminal fragment, it is a y ion. Each ion type forms a ladder of subsequences 
that differ in size and consist of one, two (dipeptide), three (tripeptide), and 
more amino acids up to the full-length peptide. An amino acid sequence is 
determined from the mass values of ions of the same type (i.e., b ion or y 
ion) by calculating the difference in mass (Δm) between subsequences. This 
difference for members of a y-ion ladder represents the successive loss of 
an amino acid from the N terminus. For the y-ion subsequences VFDEFK, 
FDEFK, DEFK, EFK, FK, and K, the difference from one subsequence to the 
next is the removal of one amino acid from the N terminus. In other words, 
for amino acid sequencing, the y ions form a reverse mass ladder. The con¬ 
verse holds true for a b-ion spectrum. Each difference of mass is equivalent 
to the mass of a known amino acid, except for leucine and isoleucine, 
which have the same mass. For example, with part of a y-ion series, the 
mass/charge ratios (m/z values) for five consecutive peaks from large to 
small are 1,171.50, 1,056.48, 942.43, 813.39, and 684.35, and the calculated 
successive differences are 115.02, 114.05, 129.04, and 129.04. Thus, based on 
known amino acid masses, the sequence is Asp-Asn-Glu-Glu. Automated 
programs are available that distinguish the ion types of a scan, remove as 
much background noise as possible, and calculate the most likely amino 
acid sequence. Protein identification does not require complete sequencing 
of all of the peptides. Frequently, partial sequences of two or three peptides 
are sufficient for effective similarity searches of protein databases. 
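
The worked example above can be reproduced with a short Python sketch that reads a sequence from the mass differences of an ion ladder. It assumes singly charged ions of one type, uses monoisotopic residue masses, and applies a tolerance of 0.02 dalton; the function name and the tolerance are choices made for this illustration only.

```python
# Monoisotopic residue masses in daltons. Leu and Ile are indistinguishable by mass.
RESIDUE_MASS = {
    "Gly": 57.02146, "Ala": 71.03711, "Ser": 87.03203, "Pro": 97.05276,
    "Val": 99.06841, "Thr": 101.04768, "Cys": 103.00919, "Leu": 113.08406,
    "Ile": 113.08406, "Asn": 114.04293, "Asp": 115.02694, "Gln": 128.05858,
    "Lys": 128.09496, "Glu": 129.04259, "Met": 131.04049, "His": 137.05891,
    "Phe": 147.06841, "Arg": 156.10111, "Tyr": 163.06333, "Trp": 186.07931,
}


def residues_from_ladder(mz_values, tol=0.02):
    """Assign a residue to each mass difference between consecutive ladder peaks."""
    peaks = sorted(mz_values, reverse=True)  # read the ladder from large to small
    sequence = []
    for larger, smaller in zip(peaks, peaks[1:]):
        delta = larger - smaller
        hits = [name for name, mass in RESIDUE_MASS.items() if abs(delta - mass) <= tol]
        # Ambiguous differences (e.g., Leu/Ile) are reported together; "?" means no match.
        sequence.append("/".join(hits) if hits else "?")
    return sequence


# The five consecutive y-ion peaks quoted in the text.
y_ions = [1171.50, 1056.48, 942.43, 813.39, 684.35]
print("-".join(residues_from_ladder(y_ions)))  # prints Asp-Asn-Glu-Glu
```

A difference that matches no single residue is reported as "?" rather than guessed, which mirrors the need to remove noise and distinguish ion types before a sequence is called.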

A system called shotgun proteomics that circumvents cumbersome 2D 
separation of proteins in a gel uses liquid chromatography combined with 
MS-MS (LC-MS-MS) for analyzing the proteins of a proteome. In this case, 
the entire mixture of proteins in a sample is initially treated with a protease. 
Then, the peptides are separated by LC, and the amino acid sequence of 
each peptide is determined by MS-MS. Finally, the proteins are identified 
by database searches. Hundreds and, in some cases, thousands of proteins, 
including those not well resolved by 2D PAGE, have been recorded for 
proteomes and subproteomes with this approach. 
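
The final database search step of shotgun proteomics can be pictured with the following Python sketch, which assigns sequenced peptides to the proteins that contain them. The protein names, sequences, and the requirement of at least two supporting peptides are assumptions for this example; real search engines score spectra statistically rather than by exact substring matching.

```python
def proteins_from_peptides(peptide_sequences, database):
    """Map each protein name to the set of sequenced peptides found within it."""
    hits = {}
    for peptide in peptide_sequences:
        for name, protein_sequence in database.items():
            if peptide in protein_sequence:
                hits.setdefault(name, set()).add(peptide)
    return hits


# Hypothetical protein database and peptide sequences obtained by MS-MS.
database = {"protA": "MKVFDEFKGARLLSTK", "protB": "MSTNPKPQRLLGK"}
peptides = ["VFDEFK", "GAR", "PQR"]

hits = proteins_from_peptides(peptides, database)
# Keep only proteins supported by two or more distinct peptides.
confident = sorted(name for name, found in hits.items() if len(found) >= 2)
print(confident)  # prints ['protA']
```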

Protein Expression Profiling 

Protein expression profiling is important for pinpointing changes during 
disease processes, cataloging differences between normal cells and cancer 
cells that can be used for diagnosis, and tracking cellular responses to toxic 
agents. Both gel and nongel methods have been developed for comparing 
the proteins of different samples. 

2D differential in-gel electrophoresis is very similar to 2D PAGE; how¬ 
ever, rather than separating proteins from different samples on individual 
gels and then comparing the maps of the separated proteins, proteins from 
two different samples are differentially labeled and then separated on the 
same 2D polyacrylamide gel. Typically, proteins of one sample are labeled 
with the fluorescent dye Cy3 and those of a second sample with Cy5 (Fig. 
5.13); the labeled samples are mixed and then run together in the same gel. 

FIGURE 5.12 Protein identification by amino acid sequencing using ESI-MS-MS. A 
spot containing an unknown protein from a 2D polyacrylamide gel is excised and 
treated with trypsin to produce peptides. The tryptic peptides are separated 
according to their mass/charge (m/z) ratios, and the amino acid sequence of a 
selected peptide is determined with a mass spectrometer (MS). The selected peptide 
is fragmented and ionized to generate a ladder of smaller peptides that differ by 
single amino acids (only the y-ion ladder [y1, y2, etc.] is depicted here). The differences 
in the masses of the fragmented peptides correspond to the characteristic 
masses of the amino acids. The unknown protein is identified by searching a protein 
database with the amino acid sequences from two or more peptides. 


which overcomes the variability between separate gel runs. The two dyes 
carry the same mass and charge, and therefore, a protein labeled with Cy3 
migrates to the same position as the identical protein labeled with Cy5. The 
Cy3 and Cy5 protein patterns are visualized separately by fluorescence 
excitation. The images are compared, and any differences are recorded. In 
addition, the ratio of Cy3 to Cy5 fluorescence for each spot is determined 
to detect proteins that are either up- or downregulated. Unknown proteins 
are identified by MS. 
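
The ratio comparison can be sketched in Python as follows. The spot identifiers, the intensity values, and the twofold cutoff are hypothetical; the cutoff in particular is an assumption of this example and not a value given in the text.

```python
import math


def classify_spots(spots, fold_change=2.0):
    """Label each spot by the log2 ratio of its Cy3 to Cy5 fluorescence."""
    threshold = math.log2(fold_change)
    calls = {}
    for spot_id, (cy3, cy5) in spots.items():
        if cy3 <= 0 or cy5 <= 0:
            raise ValueError(f"spot {spot_id} has a nonpositive intensity")
        log_ratio = math.log2(cy3 / cy5)
        if log_ratio >= threshold:
            calls[spot_id] = "higher in Cy3 sample"
        elif log_ratio <= -threshold:
            calls[spot_id] = "higher in Cy5 sample"
        else:
            calls[spot_id] = "unchanged"
    return calls


# Hypothetical (Cy3, Cy5) fluorescence readings for three spots on one gel.
spots = {"s1": (4000.0, 1000.0), "s2": (900.0, 1000.0), "s3": (500.0, 2000.0)}
calls = classify_spots(spots)
print(calls)
```

Working with the logarithm of the ratio treats a twofold increase and a twofold decrease symmetrically.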

The isotope-coded affinity tag (ICAT) method combined with 
LC-MS-MS is another way of comparing proteins from different sources 
(Fig. 5.14). An ICAT reagent consists of an affinity tag (biotin), a carbon 
chain (mass-encoded linker) that is labeled with either eight hydrogen 
(light form; H) or eight deuterium (heavy form; D) atoms, and a chemical 


FIGURE 5.13 The 2D differential in-gel electrophoresis method for quantitative anal¬ 
ysis of protein expression. The proteins of two proteomes are labeled with the fluo¬ 
rescent dyes Cy3 and Cy5, respectively. The fluorescence-labeled samples are 
combined and separated by 2D PAGE. The gel is scanned for each fluorescent dye, 
and the relative levels of the two dyes in each protein spot are recorded. The gel is 
stained with a protein dye, and each spot with an unknown protein is excised and 
treated with trypsin. The peptides are separated by ESI-MS-MS, and the amino acid 
sequences of some selected peptides are determined. The protein is identified by 
searching a protein database for a likely match with amino acid sequences from two 
or more peptides. 

group that covalently binds to an amino acid, usually cysteine. Deuterium 
is a stable isotope of hydrogen that has about twice its mass, i.e., approximately 
2 daltons versus 1 dalton. Thus, the mass difference between a peptide that is 
labeled with a light ICAT and the same peptide labeled with a heavy ICAT is 
about 8 daltons. This difference is readily detected by MS. 

The proteins of one proteome are labeled with a light ICAT and those 
of another with a heavy ICAT. The samples are mixed, treated with trypsin, 
fractionated, and passed through an avidin column to capture biotin, 
which is present only on ICAT-labeled peptides (both the light and heavy 
versions). The purified ICAT-labeled peptides are separated by LC before 
their introduction into a mass spectrometer. The key feature of this tech¬ 
nique is that the same ICAT-labeled peptides from the two samples pro¬ 
duce a pair of signals with a defined difference in mass due to the light and 
heavy versions of the labels. The relative amounts of each pair of light and 
heavy peptides reflect the relative abundances of the source proteins in the 
original samples. Finally, with ESI-MS-MS, the amino acid sequences of the 
peptides are determined, and the proteins that match the sequences can be 
identified. Hundreds of proteins from different samples can be compared 
in this way. 
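
A minimal Python sketch of the pairing step is given below. It assumes one ICAT-labeled cysteine per peptide and singly charged ions, and it uses the atomic masses of deuterium (2.01410) and hydrogen (1.00783), which give a spacing of about 8.05 daltons rather than the rounded value of 8. The function name, tolerance, and peak list are hypothetical.

```python
# Mass added by replacing eight hydrogen atoms with eight deuterium atoms.
MASS_SHIFT = 8 * (2.01410 - 1.00783)  # about 8.05 daltons


def find_icat_pairs(peaks, charge=1, tol=0.02):
    """Return (light m/z, heavy m/z, light:heavy ratio) for peaks one label apart."""
    spacing = MASS_SHIFT / charge  # the m/z spacing shrinks as the charge increases
    ordered = sorted(peaks)
    pairs = []
    for i, (mz_light, intensity_light) in enumerate(ordered):
        for mz_heavy, intensity_heavy in ordered[i + 1:]:
            if intensity_heavy <= 0:
                continue  # a ratio cannot be formed without a heavy signal
            if abs((mz_heavy - mz_light) - spacing) <= tol:
                pairs.append((mz_light, mz_heavy, intensity_light / intensity_heavy))
    return pairs


# Hypothetical peaks: (m/z, intensity). The first two differ by one ICAT mass shift.
peaks = [(1000.50, 3000.0), (1008.55, 1500.0), (1200.00, 800.0)]
pairs = find_icat_pairs(peaks)
for mz_light, mz_heavy, ratio in pairs:
    print(f"{mz_light:.2f} / {mz_heavy:.2f}  light:heavy = {ratio:.1f}")
```

Here the light form is twice as abundant as the heavy form, which would suggest that the source protein was about twice as abundant in the sample labeled with the light reagent.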

Protein Microarrays 

Conceptually, protein microarrays are similar to DNA microarrays; how¬ 
ever, rather than arrays of oligonucleotides or genes, protein microarrays 
consist of large numbers of proteins individually immobilized in known 
positions on the coated surface of a glass slide or silicon chip. The proteins 
arrayed on the surface can be antibodies specific for each protein in an 
organism, purified recombinant proteins, or short synthetic peptides. There 
are many ways of attaching a protein to a support surface. The major objec¬ 
tive of any coupling system is maintenance of protein structure and func¬ 
tion. Some systems bind the proteins to a chemical group that coats the 
surface of the support. With other protocols, recombinant proteins are pre¬ 
pared with a short amino acid sequence (tag) at the N or C terminus that 
binds to a recognition sequence on the support. In this case, all the protein 
molecules are uniformly oriented. In addition, instead of spotting proteins 
on a flat surface, some microarrays are engineered with tiny depressions 
(nanowells) that keep each protein moist and prevent mixing with adjacent 
proteins. 

The purpose of protein microarray analyses, for the most part, is to 
detect, on a large scale, the molecules that a protein interacts with. These 
interacting molecules can be other proteins, nucleic acid sequences, or low- 
molecular-weight compounds. Protein populations from different samples 
can be compared, for example, in control versus treated samples or in 
normal versus diseased tissues. There are a number of methods for visual¬ 
izing the interactions on protein microarrays. One common approach is to 
label the test samples directly with a fluorescent dye and then detect the 
labeled molecules that bind to the proteins of a microarray with a laser 
scanner (Fig. 5.15A). A two-dye strategy (e.g., Cy3 or Cy5), similar to that 
employed for DNA microarray analyses, can be used to compare proteins 
in two different samples on a single array. "Sandwich style" assays are also 
performed (Fig. 5.15B). Briefly, proteins in a sample are biotinylated and 
applied to the microarray. After the microarrays are washed to remove 
unbound proteins, streptavidin with a conjugated fluorescent dye is added, 
and the sample proteins that bind to the proteins on the microarray are 

FIGURE 5.14 ICAT method for quantitative analysis of protein expression. (A) 
Schematic representation of an ICAT reagent. An ICAT reagent has either all 
hydrogen (H) or deuterium (D) atoms at sites (X) of a linker sequence. (B) ICAT 
protocol. Proteins are extracted from two proteomes, and one proteome is labeled 
with the light (H) hydrogen-only ICAT reagent and the other with the heavy (D) 
deuterium-only ICAT reagent. The samples are combined and treated with trypsin. 
The ICAT-labeled peptides are captured by affinity chromatography using avidin, 
which binds to biotin, and fractionated by LC. The ratio of light to heavy (H:D) versions 
of the same peptide is determined by MS, which provides an estimate of the 
relative amounts of the protein in the two proteomes. The protein represented by a 
pair of heavy and light peptides is identified by amino acid sequencing with ESI-MS-MS 
and searching a protein database with this sequence for a likely match. 

detected by laser scanning. In principle, the interpretation of the signals 
from a protein microarray is very similar to analysis of DNA microarrays. 

Broadly speaking, there are three types of protein microarrays: analyt¬ 
ical (capture), reverse phase, and functional. Analytical microarrays are 
used for protein profiling, that is, the detection and quantification of pro¬ 
teins present in a sample, and consist of either protein samples applied to 
immobilized antibodies or antibody samples applied to immobilized pro¬ 
teins. Antibody microarrays are often probed with proteins from biological 
sources, such as serum or plasma, or proteins that are secreted from cells in 
culture to determine disease-specific profiles. For example, antibody 
microarrays that specifically detect cytokines have been formulated. 
Cytokines, of which there are a large number, are small, secreted proteins 
(signaling proteins) that mediate and regulate immune and inflammatory 
responses, cell death (apoptosis), cell growth, blood vessel formation, and 
differentiation in humans and other animals. Cytokine antibody microar¬ 
rays are used to examine cytokines in both normal and diseased states and 
from a variety of sources after various treatments. A sandwich immuno¬ 
assay is often used to detect cytokines that bind to immobilized antibodies 
(Fig. 5.16). After the microarray is treated with a sample containing cyto¬ 
kines, biotinylated cytokine antibodies are added. A biotinylated cytokine 
antibody will bind to a cytokine that is bound to an immobilized antibody. 
Streptavidin with an attached fluorescent dye is added next. Then, the sig¬ 
nals are detected with a laser scanner and the data are analyzed. 

In one study, plasma samples from individuals with Alzheimer disease 
and those from individuals with no dementia were applied to a microarray 
made up of antibodies against 120 cytokines. Eighteen cytokines were 
found to be associated with Alzheimer disease. The levels of 7 of these were 


FIGURE 5.15 Protein microarray detection methods. (A) Direct labeling. The sample 
molecules are labeled with a detector reagent, e.g., fluorescent dye. (B) Sandwich 
style assay. The sample molecules are biotinylated, and after the initial incubation, 
a streptavidin-fluorescent-dye conjugate that binds to biotin to facilitate the detec¬ 
tion of sample molecules is applied. 

FIGURE 5.16 Detection of cytokines with a cytokine antibody microarray. A cytokine 
antibody microarray (1) is incubated with a sample, and cytokines (solid circles) 
bind to specific antibodies (2). Free biotinylated cytokine antibodies are added and 
bind to the corresponding captured cytokine (3). For visualization, a streptavidin- 
fluorescent-dye conjugate attaches to the biotin of the secondary antibody (4). 


higher and 11 were lower in individuals with Alzheimer disease than in the 
subjects without dementia. The Alzheimer disease-associated cytokines 
adversely affect blood cell formation, immune responses, apoptosis, and 
nerve cell signaling. The Alzheimer disease cytokine pattern was also 
present in individuals with mild cognitive impairment. Currently, there is 
no definitive method for either predicting Alzheimer disease or deter¬ 
mining whether a patient has the disease. Possibly, the Alzheimer disease- 
specific cytokine signature may provide the basis for a diagnostic test for 
this degenerative disease that afflicts more than 4 million people in the 
United States. 

In another type of analytical microarray, proteins (antigens) are 
attached to a solid support and then probed with antibodies, mostly in 
serum samples. The purpose of these studies is to discover whether the 
production of antibodies against specific proteins correlates with particular 
diseases or biological processes. By way of illustration, a microarray with 
about 5,000 different human proteins was created and used to determine if 
serum from ovarian cancer patients has a distinctive set of antibodies in 
comparison to the antibody population of healthy individuals. The initial 
results revealed 94 proteins that were specifically recognized by antibodies 
in the sera from the ovarian cancer patients. With further testing, three 
proteins were consistently found to be specific for ovarian cancer. The ear¬ 
lier ovarian cancer is diagnosed, the better the chance of survival. 
Unfortunately, ovarian cancer is currently identified at a late stage. The 
ovarian-cancer-specific proteins may help in the early detection of the dis¬ 
ease. 

Antibody microarrays have also been used to determine whether par¬ 
ticular posttranslational protein modifications, such as phosphorylation of 
tyrosine or glycosylation, are associated with specific diseases. To screen 
for proteins that contain phosphotyrosine, proteins are first captured by 
primary antibodies immobilized on a microarray, and then the microarray 
is flooded with biotinylated anti-phosphotyrosine antibody. Next, strepta- 
vidin conjugated with a fluorescent dye is added, and the protein spot with 
the fluorescence is detected (Fig. 5.17A). In a similar manner, glycosyl 
groups (glycans) present on proteins can be visualized by adding a biotiny¬ 
lated lectin to proteins that are bound to immobilized antibodies on an 
array (Fig. 5.17B). Lectins are plant glycoproteins that bind to specific car¬ 
bohydrate moieties on the surfaces of proteins or cell membranes, and 
many different lectins with affinities for different glycans are available. The 
use of lectins can facilitate the detection of specific protein glycosylation 
patterns. 

The utility of antibody microarrays has been enhanced by the expan¬ 
sion of libraries that produce highly specific antibodies. For example, 
clones making antibodies against more than 1,800 human proteins have 
been isolated, characterized, and validated. The long-term objective of this 
project is to have available one antibody for each protein from every coding 
sequence in the human genome, that is, a total of about 22,000 antibodies. 

With a reverse-phase microarray, a multiprotein sample, for example, 
from a cell lysate or tissue specimen, is immobilized in a single spot on a 
support. Several such multiprotein samples are spotted on the microarray, 
which is then probed with a single target molecule. This format contrasts 
with analytical and functional microarrays, in which immobilized spots 
containing single proteins are probed with multiple targets; hence, the term 
"reverse" is used. The advantage of the reverse-phase microarray is that a 
large number of samples can be compared at one time. With a reverse- 
phase microarray, the presence of specific proteins in multiple complex 
samples can be readily determined (Fig. 5.18). 

Functional protein microarrays feature large sets of individual proteins 
that are used predominantly to determine interactions with other proteins 
or low-molecular-weight compounds, such as lipids, drugs, and metabo¬ 
lites (Fig. 5.19). Enzymes, such as oxidoreductases, kinases, proteases, and 
glycosidases, with novel activities that are useful for biotechnology and 
medicine are likely to be discovered using functional microarrays. Ideally, 
a functional protein microarray should consist of all possible proteins of a 
proteome under study. To obtain comprehensive representation of a pro- 
teome, a library containing all of the protein coding sequences is first con¬ 
structed. The term ORF, which stands for open reading frame, is used to 
represent any protein-coding sequence, and a library of cloned protein¬ 
encoding ORFs has been dubbed an ORFeome. Large-scale expression of 

FIGURE 5.17 Detection of post-translational modifications with antibody microarrays. 
(A) Detection of tyrosine phosphorylation. An antibody microarray (1) is 
incubated with a protein sample (2). Biotinylated phosphotyrosine antibody is 
added (3), and for visualization, a streptavidin-fluorescent-dye conjugate attaches 
to the biotin of the phosphotyrosine antibody (4). (B) Detection of glycan groups. 
An antibody microarray (1) is incubated with a protein sample (2). A biotinylated 
molecule (e.g., lectin) that binds to a specific glycan is added (3), and for visualiza¬ 
tion, a streptavidin-fluorescent-dye conjugate attaches to the biotin of the lectin 
(4). 


the library is followed by purification of each of the proteins, which are 
subsequently arrayed on a solid support. 

The starting point for producing an ORFeome is usually PCR amplifi¬ 
cation of the coding sequences for cloning into a vector. For prokaryotic 
organisms, the protein-coding sequences can often be readily identified 
from genomic sequences. On the other hand, full-length cDNA libraries 
are the primary sources of the coding sequences of a eukaryotic pro- 
teome. 

FIGURE 5.18 Reverse-phase microarray format. Multiprotein samples, e.g., cell 
lysates, are spotted on a solid support (1) and incubated with a known biotinylated 
antibody (2). A streptavidin-fluorescent-dye conjugate is used to identify samples 
with bound antibody (3). 


A rapid and versatile system for efficient cloning of PCR-generated 
ORFs without costly and time-consuming restriction endonuclease and 
ligation reactions is known as recombinational cloning (also known as 
Gateway cloning technology). It exploits the bacteriophage λ system for 
integration of viral DNA into, and its excision from, the host bacterial genome. 
Briefly, as background for understanding recombinational cloning, the integration 
of bacteriophage λ into the E. coli chromosome requires a specific 
attachment sequence in the bacteriophage genome (243-bp attachment 
phage [attP] site) and another sequence in the bacterial genome (25-bp 
attachment bacteria [attB] site), plus an E. coli-encoded protein called integration 
host factor and the bacteriophage λ recombination protein integrase 
(Fig. 5.20A). Recombination between the attP and attB sequences 
results in insertion of the phage genome into the bacterial genome to create 
a prophage with the attachment sites attL (100 bp) and attR (168 bp) at the 
left and right ends of the integrated bacteriophage λ DNA, respectively. For 
subsequent excision of the bacteriophage λ DNA from the bacterial chromosome, 
recombination between the attL and attR sites is mediated by 
integration host factor, integrase, and bacteriophage λ excisionase (Fig. 
5.20B). The recombination events occur at precise locations without either 
the loss or gain of nucleotides. 

For recombinational cloning, bacterial λ attachment sites are added to 
each ORF during PCR amplification. The PCR primers carry modified attB 
sequences that recombine only with specific attP sequences. For example, 
attB1 recombines only with attP1, and attB2 recombines only with attP2. One of 
the primers has the attB1 sequence, and the other primer has attB2 (Fig. 
5.21). If required, translation start and termination codons can be added to 

FIGURE 5.19 Functional protein microarray platform. The individual proteins of a 
functional protein microarray can be examined for interactions with proteins, 
lipids, and drugs, among other compounds, and tested for substrate binding and 
enzyme activities. Here, for convenience, direct labeling of the input sample is 
depicted as the detection method. However, there are a variety of visualization 
protocols for different types of samples. 


an ORF by including the corresponding nucleotides on the PCR primers. A 
nucleotide sequence that encodes a short amino acid sequence (affinity tag) 
that is in frame with each ORF is also included on one of the primers. The 
affinity tag enables a protein to be selectively purified and may be added 
to the N and/or C terminus of the protein. Another short sequence 
encoding an in-frame peptide provides a cut site that is used to remove the 
affinity tag after the protein is purified. 

Following amplification of a specific ORF by PCR using the primer 
pair, the PCR products carrying the ORF with flanking attB1 and attB2 

FIGURE 5.20 Integration (A) and excision (B) of bacteriophage λ into and 
from the E. coli genome via recombination between attachment (att) sites in 
the bacterial and bacteriophage DNAs. 


sequences are mixed with a vector (donor vector) that has attP1 and attP2 
sites flanking a negative selection gene (ccdB) (Fig. 5.22A). When present 
and expressed, the product of the ccdB gene interferes with DNA replication 
and is toxic to bacterial cells. Integration host factor and integrase are 
added to the mixture of DNA molecules to catalyze in vitro recombination 
between the attB1 and attP1 sites and between the attB2 and attP2 sites. As 
a consequence of the two recombination events, the ccdB gene sequence 
between the attP1 and attP2 sites on the donor vector is replaced by the 
ORF. The recombination events create new attachment sites flanking the 
ORF sequence (designated attL1 and attL2), and the plasmid with the 
attL1-ORF-attL2 sequence is referred to as an entry clone. The mixture of 
original and recombinant DNA molecules is transformed into E. coli, and 
cells that are transformed with donor vectors that have not undergone 
recombination retain the ccdB gene and therefore do not survive. Host cells 
carrying the entry clone are positively selected by the presence of a selectable 
marker. This procedure is repeated to clone each of the ORFs in the 
proteome. 

The next step to obtain functional proteins is the expression of each 
cloned ORF. For expression, the ORF is transferred from the entry clone to 
a destination vector that carries a promoter and other expression signals. 
An entry clone is mixed with a destination vector that has attR1 and attR2 
sites flanking a ccdB gene (Fig. 5.22B). In the presence of integration host 
factor, integrase, and bacteriophage λ excisionase, the attL1 and attL2 sites 
on the entry clone recombine with the attR1 and attR2 sites, respectively, on 
the destination vector. This results in the replacement of the ccdB toxin gene 
on the destination vector with the ORF from the entry clone, and the resul¬ 
tant plasmid is designated an expression clone. The reaction mixture is 
transformed into E. coli, and a selectable marker is used to isolate trans¬ 
formed cells that carry an expression clone. Cells that carry an intact desti¬ 
nation vector or the exchanged entry plasmid (known as a by-product 
plasmid) will not survive because they carry the ccdB gene. Destination 
vectors are available for maintenance and expression of the ORF in various 
host cells, such as E. coli and yeast, insect, and mammalian cells. For con¬ 
struction of a microarray, each protein encoded by an ORF is isolated by 
affinity purification using the affinity tag that was encoded on the initial 


FIGURE 5.21 Primer pair used to amplify ORFs for recombinational cloning to gen¬ 
erate an ORFeome. 

FIGURE 5.22 Recombinational cloning. (A) Recombination (thin vertical lines) 
between a PCR-amplified ORF with attachment sites (attB1 and attB2) and a donor 
vector with attP1 and attP2 sites on either side of the ccdB gene results in an entry 
clone in which the ORF is flanked by attL1 and attL2 sites. The selectable marker 
(SM1) selects transformed cells with an entry clone. The protein encoded by ccdB is 
toxic to transformed cells with nonrecombined donor vector molecules. The origin 
of replication of the donor vector is not shown. (B) Recombination (thin vertical 
lines) between the entry clone with attL1 and attL2 sites and a destination vector 
with attR1 and attR2 sites results in an expression clone with attB1 and attB2 sites 
flanking the ORF. The selectable marker (SM2) selects transformed cells with an 
expression clone. The second plasmid, designated a by-product, has the ccdB gene 
flanked by attP1 and attP2 sites. Cells with an intact destination vector that did not 
undergo recombination or that retain the by-product plasmid are killed by the CcdB 
protein. Transformed cells with an entry clone, which lacks the SM2 selectable 
marker, are selected against. The origins of replication and the sequences for expression 
of the ORF are not shown. 


PCR primer used to amplify the ORFs (Fig. 5.21). Protocols have been 
developed to isolate thousands of proteins in parallel to facilitate the cre¬ 
ation of proteomic microarrays. 
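
The bookkeeping of the two recombination steps and the selection that follows each can be summarized in a small Python model. This is only a sketch of the logic described above: the class, the function names, and the marker labels SM1 and SM2 (taken from Fig. 5.22) are illustrative, and no sequences or reaction conditions are modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Molecule:
    left_site: str          # e.g., "attB1"
    cargo: str              # DNA between the sites: "ORF" or "ccdB"
    right_site: str         # e.g., "attB2"
    marker: Optional[str]   # selectable marker on the backbone; None for a PCR product


# Site types that recombine, and the site types that result:
# attB x attP gives attL (product) and attR (by-product): integrase + integration host factor.
# attL x attR gives attB (product) and attP (by-product): the same proteins plus excisionase.
SITE_PRODUCTS = {("attB", "attP"): ("attL", "attR"), ("attL", "attR"): ("attB", "attP")}


def recombine(insert_source, recipient):
    """Swap the cargo of two molecules; return (product, by_product)."""
    types = (insert_source.left_site[:4], recipient.left_site[:4])
    if types not in SITE_PRODUCTS:
        raise ValueError(f"{types[0]} sites do not recombine with {types[1]} sites")
    left_no, right_no = insert_source.left_site[4:], insert_source.right_site[4:]
    if (left_no, right_no) != (recipient.left_site[4:], recipient.right_site[4:]):
        raise ValueError("site 1 recombines only with site 1, and site 2 with site 2")
    product_type, by_product_type = SITE_PRODUCTS[types]
    product = Molecule(product_type + left_no, insert_source.cargo,
                       product_type + right_no, recipient.marker)
    by_product = Molecule(by_product_type + left_no, recipient.cargo,
                          by_product_type + right_no, insert_source.marker)
    return product, by_product


def survives(molecule, selection):
    """A transformed cell lives only without ccdB and with the selected marker."""
    return molecule.cargo != "ccdB" and molecule.marker == selection


pcr_product = Molecule("attB1", "ORF", "attB2", None)
donor_vector = Molecule("attP1", "ccdB", "attP2", "SM1")
destination_vector = Molecule("attR1", "ccdB", "attR2", "SM2")

entry_clone, _ = recombine(pcr_product, donor_vector)
expression_clone, by_product = recombine(entry_clone, destination_vector)

for name, molecule in [("expression clone", expression_clone),
                       ("destination vector", destination_vector),
                       ("by-product", by_product),
                       ("entry clone", entry_clone)]:
    print(f"{name:18s} survives SM2 selection: {survives(molecule, 'SM2')}")
```

Only the expression clone passes both tests. The destination vector and the by-product carry ccdB, and the entry clone lacks the SM2 marker.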

Protein-Protein Interaction Mapping 

Proteins seldom act alone. On average, one protein interacts with five 
others. Some protein-protein interactions are short-lived, others form 

FIGURE 5.23 Two-hybrid assay for 
detecting pairwise protein interactions. 
(A) The DNA-binding domain of a tran¬ 
scription factor binds to a specific 
sequence in the regulatory region of a 
gene, which orients and localizes the 
activation domain that is required for the 
initiation of transcription of the gene by 
RNA polymerase. (B) The coding 
sequences for the DNA-binding domain 
and the activation domain are fused to 
DNA X and DNA Y, respectively, and 
both constructs (hybrid genes) are intro¬ 
duced into a cell. After translation, the 
DNA-binding domain-protein X fusion 
protein binds to the regulatory sequence 
of a reporter gene. However, protein Y 
(prey) does not interact with protein X 
(bait), and the reporter gene is not tran¬ 
scribed because the activation domain 
does not, on its own, associate with RNA 
polymerase. (C) The coding sequence for 
the activation domain is fused to the 
DNA for protein Z (DNA Z) and trans¬ 
formed into a cell containing the DNA- 
binding domain-DNAX fusion construct. 
The proteins encoded by the cDNAs of 
the hybrid genes interact, and the activa¬ 
tion domain is properly oriented to ini¬ 
tiate transcription of the reporter gene, 
demonstrating a specific protein-protein 
interaction. 
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stable multicomponent complexes, and at a higher level of cellular organi¬ 
zation, complexes interact with one another. Determining the functional 
interconnections among the members of a proteome is not an easy task. The 
strategies for examining protein-protein interactions on a large scale 
require a number of experimental manipulations, with no guarantee that 
all potential interactions will be recognized. Notwithstanding the limita¬ 
tions of existing protocols, thousands of protein-protein interactions for 
proteomes of single-celled and multicellular organisms have been cata¬ 
logued. 

The two-hybrid method that was originally devised for studying the 
yeast proteome has been used extensively to determine pairwise 
protein-protein interactions in both eukaryotes and prokaryotes. The underlying 
principle of this assay is that the physical connection between two proteins 
reconstitutes an active transcription factor that initiates the expression of a 
reporter gene. Generally, transcription factors have two domains. One 
region is required for binding to a specific DNA site (DNA-binding 
domain), and the other region activates transcription (activation domain) 
(Fig. 5.23A). These two domains need not be part of the same protein to 
function as an effective transcription factor. However, the activation 
domain alone will not bind to RNA polymerase to activate transcription. 
Connection with the DNA-binding domain is necessary to place the 
activation domain in the correct orientation and location to initiate transcription 
by RNA polymerase. 

For a two-hybrid system, the coding regions of the DNA-binding and 
activation domains of a specific transcription factor are isolated and 
cloned into separate vectors. Often, the Gal4 transcription factor from 
Saccharomyces cerevisiae or the bacterial LexA transcription factor is used. A 
cDNA sequence is cloned in frame with the DNA-binding domain 
sequence to form a fusion gene that produces a hybrid protein. A protein 
attached to the DNA-binding domain is called the "bait," or target. This is 
the target protein for which interacting proteins are to be identified. 
Another cDNA sequence is cloned adjacent to the activation domain 
coding sequence. A protein attached to the activation domain is called the 
"prey" and potentially interacts with the target, or bait, protein. Host cells 
are transformed with both bait and prey DNA constructs. After expression 
and translation, if the bait and prey do not interact, then there is no 
transcription of the reporter gene (Fig. 5.23B). However, if the reporter gene is 
transcribed, then a physical connection occurred between the bait and 
prey proteins that brought the DNA-binding and activation domains 
together and enabled the activation domain to make contact with RNA 
polymerase (Fig. 5.23C). In other words, there was a specific interaction 
between the bait and prey proteins. The product of an active reporter gene 
may either allow a host cell to proliferate in a specific medium or produce 
a colorimetric response. 
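
The decision logic of the two-hybrid assay can be summarized as a small truth 
table. The following Python sketch is only an illustration: the function name 
reporter_transcribed and the set of interacting pairs are hypothetical and 
simply mirror proteins X, Y, and Z of Fig. 5.23; they do not come from any 
real data set. 

```python
# Toy model of a two-hybrid assay. The interacting pair listed here is a
# hypothetical placeholder that mirrors proteins X and Z of Fig. 5.23.
KNOWN_INTERACTIONS = {frozenset({"X", "Z"})}


def reporter_transcribed(bait, prey, bait_has_dbd=True, prey_has_ad=True):
    """Return True if the reporter gene would be transcribed.

    Transcription requires (1) a DNA-binding domain fused to the bait,
    (2) an activation domain fused to the prey, and (3) a physical
    interaction between the bait and prey proteins.
    """
    interacts = frozenset({bait, prey}) in KNOWN_INTERACTIONS
    return bait_has_dbd and prey_has_ad and interacts


if __name__ == "__main__":
    for prey in ("Y", "Z"):
        state = "on" if reporter_transcribed("X", prey) else "off"
        print(f"bait X + prey {prey}: reporter {state}")
```

In this simplified model, removing either fused domain switches the reporter 
off even when the bait and prey interact, which reflects the requirement that 
both domains be brought together. 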

A variant of the two-hybrid system has been developed for studying 
protein interactions in mammalian cells. With this scheme, two genetically 
altered subunits, α and ω, of an indicator enzyme (β-galactosidase) are 
unable to associate under normal cellular conditions, and as a result, there 
is no β-galactosidase activity (Fig. 5.24A). cDNAs are cloned in frame with 
the genes for proteins α and ω, and the constructs are tested by 
transforming both into a mammalian cell (Fig. 5.24B and C). If two cDNA-encoded 
proteins interact, then proteins α and ω are brought into close proximity, 
and the formation of a functional β-galactosidase is detected with a 
colorimetric assay (Fig. 5.24C). 

FIGURE 5.24 Complementation assay for detecting pairwise protein interactions 
in mammalian cells. (A) Proteins α and ω must combine to produce an active 
enzyme but are not able to interact spontaneously due to mutations. (B) DNA 
fusion constructs of the gene encoding protein α (gene α) with a cDNA (cDNA 
X) and the gene encoding protein ω (gene ω) with a cDNA (cDNA Y) are 
introduced into a cell. Since proteins X (bait) and Y (prey) do not interact, 
proteins α and ω do not associate, and the activity specified by the α:ω 
combination is not observed. (C) DNA fusion constructs of the gene encoding 
protein α (gene α) with a cDNA (cDNA X) and the gene encoding protein ω 
(gene ω) with a cDNA (cDNA Z) are introduced into a cell. Proteins X and Z 
interact, bringing together proteins α and ω, and the activity specified by 
the α:ω combination is observed. 

As a first step in a large-scale protein-protein interaction study, two 
libraries are prepared, each containing thousands of cDNAs generated 
from total cellular mRNA (or genomic DNA fragments in a study of 
proteins from a prokaryote). In one case, the cDNAs are cloned into the vector 
adjacent to the DNA sequence for the DNA-binding domain of the 
transcription factor Gal4 to form a library of hybrid bait genes. In the other 
case, the cDNAs are cloned into a different vector containing the sequence 
for the activation domain to form the prey library. The libraries are 
typically screened for bait-prey protein interactions in one of two ways. In one 
method, a library of yeast cells containing the prey-activation domain 
constructs is arrayed on a grid. The library is then screened for the production 
of proteins that interact with a hybrid bait protein by introducing 
individual bait-DNA-binding domain constructs to the arrayed clones by 
mating (Fig. 5.25A). Alternatively, each yeast clone in a bait library is mated 
en masse with a mixture of strains in the prey library, and then positive 
interactions are identified by screening for activation of the reporter gene 
(Fig. 5.25B). Challenges with using the two-hybrid system for large-scale 
determination of protein-protein interactions include the inability to clone 
all possible ORFs in the libraries in frame with the activation and 
DNA-binding domains, which leads to missed interactions (false negatives), and 
the detection of interactions that do not normally occur in their natural 
environments within the original cells and therefore are not biologically 
relevant (false positives). Nonetheless, this approach has been used to 
successfully identify interacting proteins in several organisms, including 
bacteria, viruses, yeast, the fruit fly Drosophila melanogaster, the roundworm 
Caenorhabditis elegans, and humans. 

FIGURE 5.25 Large-scale screens for protein interactions using the yeast 
two-hybrid system. Two libraries are prepared, one containing genomic DNA 
fragments or cDNAs fused to the coding sequence for the DNA-binding domain 
of a transcription factor (bait library) and another containing genomic DNA 
fragments or cDNAs fused to the activation domain of the transcription factor 
(prey library). Two methods are commonly used to screen for pairwise protein 
interactions. (A) Individual yeast strains in the bait library are mated with 
each yeast strain in an arrayed prey library. The resulting strains in the 
array that produce bait and prey proteins that interact are detected by 
assaying for reporter gene activation (activated cells growing in a multiwell 
plate are indicated in green). (B) Yeast strains in the prey library are mated 
en masse with individual strains in the bait library. The mixture of strains 
is screened for reporter gene activity, which identifies strains with 
interacting bait and prey proteins (green). 

FIGURE 5.26 Protein interaction map of calcium signaling protein clusters of D. 
melanogaster. Individual proteins (solid circles) are not named. The thickness of a 
connecting line denotes the extent of the interaction rating. Adapted from Giot et 
al., Science 302:1727-1736, 2003. 

Many schemes have been devised to streamline the acquisition of 
results. Regardless of the protocol, vast numbers of interactions are scored. 
Specialized computer programs are required to categorize and map all the 
relationships. As part of this analysis, stringent statistical criteria are used 
to minimize the numbers of possible false-positive interactions in the final 
data set. With D. melanogaster, an overall protein interaction (interactome) 
map of 3,000 interactions with 3,522 proteins was delineated. In addition, 
the nuclear, cytoplasmic, and extracellular locations of 2,268 interactions 
with 2,346 proteins were mapped. Finally, smaller interacting sets of 
proteins within cellular regions were noted (Fig. 5.26). Another study with 
Drosophila revealed 710 protein-protein interactions with 641 proteins. 
Surprisingly, there was little congruence between the two studies. Clearly, 
the technical reasons for this difference need to be determined. A network 
of 5,500 protein interactions was constructed for C. elegans, and a similar 
number was determined for S. cerevisiae. Protein interaction maps place 
proteins with unknown functions into contexts that provide clues about 
their roles in cellular processes and identify proteins with multiple 
functions. 
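
To make the idea of an interaction map and of "congruence" between two 
studies more concrete, the following Python sketch builds a map from a list of 
pairwise interactions and scores the overlap of two data sets with a Jaccard 
index. The protein names, both data sets, and the choice of the Jaccard index 
are assumptions made for illustration; they are not the data or the 
statistical criteria used in the studies cited above. 

```python
from collections import defaultdict


def build_map(pairs):
    """Build an undirected interaction map: protein -> set of partners."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def edge_set(pairs):
    """Normalize pairs so that (A, B) and (B, A) count as one interaction."""
    return {frozenset(pair) for pair in pairs}


def congruence(pairs_1, pairs_2):
    """Jaccard index of two interaction sets (1.0 = identical, 0.0 = disjoint)."""
    e1, e2 = edge_set(pairs_1), edge_set(pairs_2)
    union = e1 | e2
    return len(e1 & e2) / len(union) if union else 1.0


# Hypothetical results from two independent screens.
study_1 = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5")]
study_2 = [("P2", "P1"), ("P4", "P6"), ("P5", "P6")]

if __name__ == "__main__":
    interaction_map = build_map(study_1)
    for protein in sorted(interaction_map):
        print(protein, sorted(interaction_map[protein]))
    print("congruence:", round(congruence(study_1, study_2), 2))
```

In this toy example, only one of the six distinct interactions is found in both 
screens, so the score is low even though the two screens share most of their 
proteins. 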

Instead of studying pairwise protein interactions, the tandem affinity 
purification (TAP) tag procedure is designed to capture multiprotein 
clusters and then identify the components by MS (Fig. 5.27). In this case, a 
cDNA sequence that encodes the target (bait) protein is fused to a DNA 
sequence that encodes two short amino acid sequences (tags). An amino 
acid sequence tag binds with high affinity to a specific molecule and 
facilitates purification of the target protein. A "two-tag" system allows two 
successive rounds of affinity binding to ensure that the target and its associated 
proteins are free of any nonspecific proteins. Alternatively, a "one-tag" 
system with a small protein tag that is immunoprecipitated with a specific 
antibody requires only a single purification step. In a number of trials, the 
tags did not alter the functions of various test proteins. 

A cDNA-two-tag construct is introduced into a host cell, where it is 
expressed, and a tagged protein is synthesized. The underlying assumption 
is that the cellular proteins that normally interact with the native protein in 
vivo will also combine with the tagged protein. After the cells are lysed, the 
target protein and any interacting proteins are purified using the affinity 
tags. The proteins of the cluster are separated according to their molecular 
weights by PAGE. Individual bands are excised and treated with trypsin, 
and the proteins are identified by MS. Computer programs are available for 
generating maps of clusters with common proteins, assigning proteins with 
shared interrelationships to specific cellular activities, and establishing the 
links between multiprotein complexes. 
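
As a rough illustration of how clusters with common proteins might be linked, 
the Python sketch below compares the protein sets recovered with different 
bait proteins and reports the components they share. The bait names, protein 
names, and the function shared_components are hypothetical; actual analysis 
programs apply far more elaborate scoring. 

```python
from itertools import combinations

# Hypothetical TAP results: bait protein -> proteins identified by MS
# in the purified cluster (the bait itself is included).
clusters = {
    "BaitA": {"BaitA", "P1", "P2", "P3"},
    "BaitB": {"BaitB", "P2", "P3", "P4"},
    "BaitC": {"BaitC", "P7", "P8"},
}


def shared_components(clusters):
    """Return {(bait_1, bait_2): shared proteins} for clusters that overlap."""
    links = {}
    for bait_1, bait_2 in combinations(sorted(clusters), 2):
        common = clusters[bait_1] & clusters[bait_2]
        if common:
            links[(bait_1, bait_2)] = common
    return links


if __name__ == "__main__":
    for (bait_1, bait_2), common in shared_components(clusters).items():
        print(f"{bait_1} and {bait_2} share {sorted(common)}")
```

Clusters that share several components are candidates for belonging to the 
same complex or to linked complexes. 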
FIGURE 5.27 TAP tag procedure for detecting protein-protein interactions. Two 
DNA sequences (tag 1 and tag 2), each encoding a short amino acid sequence with 
high affinity for a specific molecule, are cloned together and fused in frame to the 
3' end of the coding region of a cDNA (cDNA X). The tagged cDNA construct is 
introduced into a host cell, where it is transcribed and the mRNA is translated. 
Other cellular proteins bind to the protein encoded by cDNA X (protein X). The 
cluster consisting of protein X and its associated proteins is separated from cell 
components (squares) by the binding of tag 1 to its affinity partner, which is usually 
fixed to a column that retains the cluster while allowing all noninteracting proteins 
to flow through. The cluster is eluted from the affinity partner, typically by cleaving 
off tag 1, and a second purification step is carried out with tag 2 and its affinity 
partner. The proteins of the cluster are separated by one-dimensional PAGE. Single 
bands are excised from the gel and treated with trypsin. The protein represented by 
the tryptic peptides is identified by either peptide mass fingerprinting or searching 
a protein database with peptide amino acid sequences obtained with ESI-MS-MS. 
SUMMARY 


Bioinformatics grew as a field of study, for the most part, 
from the efforts to maintain, organize, analyze, and make 
accessible large amounts of gene and genomic sequence 
information. GenBank, a gene sequence database, was established 
in the early 1980s to cope with the influx of DNA sequences in 
the scientific literature. By the mid-1990s, a myriad of 
databases had been developed for genomic sequences, genetic and 
physical maps, ESTs, and many other types of molecular data. 
The expansion of the Internet and the availability of browsers 
led to enhanced pictorial presentation of information stored in 
databases, and the development of computational tools 
enabled rapid searching of DNA or protein databases with 
query sequences and other kinds of sequence analyses. 
Currently, there are hundreds of public online molecular 
databases. 

Metagenomics is the study of the genome sequences of a 
complex population of microorganisms. DNA consisting of 
many different microbial genomes can be extracted directly 
from environmental samples and sequenced, and the sequences 
of as many complete genomes as possible can be assembled 
and analyzed. With sequence database comparisons, the 
organisms are placed among their phylogenetic relatives, if 
any exist. By determining the genetic capacities of annotated 
sequences, physiological profiles can be predicted for organisms 
that cannot currently be grown in the laboratory. 
Moreover, variations in the nucleotide sequences of various 
genes are determined to discover proteins with unusual, and 
potentially useful, characteristics. 

The expression of thousands of genes in cells or tissues can 
be tracked simultaneously by hybridization of target sequences 
in extracted cellular mRNA to the bound probes of cDNA or 
oligonucleotide microarrays. DNA microarrays are used to 
quantify differences in gene expression among samples, such 
as between diseased and normal tissues or mutant and wild-type 
cells, and among cells exposed to different internal or 
external conditions. Diagnostic tests based on DNA microarray 
patterns are being developed that detect the expression of 
genes known to be associated with a diseased state (molecular 
markers). SAGE relies on sequence information from short 
segments at the ends of cDNAs derived from a population of 
mRNAs to assess which genes are up- or downregulated 
under various conditions. 

The comprehensive study of all the proteins of a cell, tissue, 
body fluid, or organism, i.e., a proteome, is called proteomics. 
High-throughput methods have been developed to identify 
most members of a proteome, compare the levels of individual 
proteins among proteomes, and characterize thousands of 
protein-protein interactions within a proteome. There are two 
main ways of obtaining comparative protein expression 
profiles. With these approaches, proteomes from two different 
sources are mixed and analyzed together. For 2D differential 
in-gel electrophoresis, the proteins of two samples are labeled 
with two different fluorescent dyes and separated by 2D 
PAGE. Then, the specific fluorescent emissions from the 
individual proteins are recorded to determine the relative 
proportions of the proteins in each sample. In another method, 
peptides from two different samples are labeled with a heavy 
form (containing eight deuterium atoms) or a light form (with 
no deuterium atoms) of an ICAT. The same ICAT-labeled 
peptides in the two samples produce a pair of signals with a 
defined difference in mass due to the light and heavy versions 
of the labels. MS is used to measure the relative abundances of 
the proteins in the original proteomes. 

Protein microarrays consist of a large number of proteins 
immobilized in a small area on a support to facilitate 
massively parallel analyses of interactions between proteins, 
between proteins and low-molecular-weight compounds, and 
between antibodies and antigens. Based on the immobilized 
proteins, there are three kinds of protein microarrays: 
analytical, reverse phase, and functional. Analytical microarrays use 
immobilized antibodies to capture proteins or immobilized 
proteins to bind antibodies. Multiprotein complexes, such as 
cell lysates, are bound to the support for reverse-phase 
microarrays. Functional protein microarrays contain as many 
members of a proteome or subproteome as possible to study 
the activities of proteins by their interactions with other 
proteins or small molecules. Recombinational cloning is often 
used to generate libraries containing the ORFs for all of the 
proteins encoded in a genome. 

Protein interaction maps can reveal novel functions for 
proteins, place proteins with unknown functions among those 
with known cellular roles, and elucidate new biological 
pathways and cellular machines. The yeast two-hybrid system 
detects interactions between two proteins (bait and prey) that 
bring together the activation and DNA-binding domains of a 
transcription factor required for the initiation of transcription 
of a reporter gene. Large networks of protein-protein 
interactions are revealed by many thousands of pairwise assays. The 
TAP tag procedure is used to isolate clusters of proteins that 
associate with a test (bait) protein in vivo. In this method, the 
individual proteins of a purified protein complex are 
separated by one-dimensional PAGE and identified by either 
peptide mass fingerprinting or MS amino acid sequencing and a 
database similarity search. 
REVIEW QUESTIONS 

1. The Basic Local Alignment Search Tool (BLAST) finds 
regions of similarity between sequences. For a brief 
introduction to a BLAST nucleotide search, go to 
http://www.ncbi.nlm.nih.gov/blast/Blast.cgi and click on the link labeled 
"nucleotide blast." Type accession number NM_000492.3, 
which is the unique identifier assigned to a specific DNA 
sequence, into the "Enter accession number..." box. In the 
pull-down menu in the "Database" field, select "Nucleotide 
collection (nr/nt)." Check the "Show results in a new window" 
box. Click on "Algorithm parameters," and select 10 from the 
"Max target sequences" menu. Click on BLAST. From the 
results of the nucleotide sequence, determine the identity of 
the gene that has been given accession number NM_000492.3. 

2. What is the likely protein that contains the amino 
acid sequence LSPQMSGEEEDSDLAAKLGMCNREIVRRGA? 
To answer this question, go to 
http://www.ncbi.nlm.nih.gov/blast/Blast.cgi and click on the link labeled "Protein 
blast." Enter the amino acid sequence into the "Enter accession 
number..." field. Name the job "test" or whatever appeals to 
you. Check "Show results in a new window." Click on 
"Algorithm parameters," and select 10 from the "Max target 
sequences" menu. Click on BLAST. The results should reveal 
the identity of the protein containing the amino acid 
sequence. 

3. What is a DNA microarray? 

4. Describe a cDNA microarray gene expression profiling 
system. 

5. How is the target sample prepared for an oligonucleotide 
microarray experiment? 

6. What is an affinity tag? 

7. What are the objectives of gene expression profiling 
experiments? 

8. Describe the SAGE procedure. 

9. Find the best gene for the following SAGE tags: 
ATCGCTTTCT, TTTTGTCATT, and GGCCCCAGTT. Go to the 
SAGE Anatomic Viewer 
(http://cgap.nci.nih.gov/SAGE/AnatomicViewer), and under the "Find the Best Gene for a 
Tag" heading, enter one of the query tag sequences at a time 
in the "10 bp tag" box, and then click on Go. 

10. Why are there so many more proteins than genes in 
humans? 

11. What are the principal features of 2D PAGE? 

12. Describe how unknown members of a proteome can be 
readily identified. 

13. What is a peptide mass fingerprint? 

14. Are you curious about the unknown protein in Fig. 5.11? 
If so, type the peptide masses of the unknown protein as a 
column in a word processor. Omit the dots. Copy the column 
to the clipboard. Go to 
http://www.expasy.org/cgi-bin/aldente/form.cgi. Paste the list of peptide masses into the 
"Peak list" box. Change "Minimum number of Hits" from 
four to nine. Enter your e-mail address in the appropriate box. 
Click on "Submit." An e-mail message should arrive within 10 
minutes with the results. What is the unknown protein? You 
may also want to check out the peptide mass fingerprint 
program at http://www.matrixscience.com. Click on the link 
labeled "Mascot." Click on "Peptide Mass Fingerprint." Enter 
your name and e-mail address. Give a name to the project in 
the "Search title" field. Paste the peptide masses into the 
"Query" box. Select 5 in the "Report top" field. Click on "Start 
Search...." Note the "Top Score" and other information on the 
Mascot Search Results page in your browser. 

15. The following ladder of y-ion masses (m/z ratios) was 
observed with MS after fragmentation and ionization of a 
peptide: 1,136, 980, 851, 723, 609, 512, 411, 324, 196, and 99. 
What is the likely amino acid sequence of the peptide? A list 
of average amino acid masses can be obtained from a number 
of online sites. Simply do an online search with the phrase 
"amino acid masses." 

16. What is the ICAT method? 

17. Describe the major features of the two-hybrid assay. 

18. What is the TAP tag system? 

19. What is a protein microarray? 
20. Compare and contrast analytical and reverse-phase 
microarrays. 

21. Describe how a library of ORFs that represents a proteome 
is constructed by recombinational cloning. 


22. What types of molecules can be captured by a functional 
protein microarray? 

23. What are the objectives of protein-protein interaction 
studies? 
Intrinsic Protein Stability 
Facilitating Protein Folding 
Manipulation of Gene 
Expression in Prokaryotes 


The primary objective of gene cloning for biotechnological 
applications is the expression of the cloned gene in a selected host organism. 
Unfortunately, the insertion of a gene into a cloning vector does not 
necessarily ensure that it will be successfully expressed. Moreover, for 
many commercial purposes, a high rate of production of the protein 
encoded by the cloned gene is required. In response to the need for a high 
expression rate, many specialized expression vectors have been created 
that provide genetic elements for controlling transcription, translation, 
protein stability, and secretion of the product of the cloned gene from the 
host cell. The molecular biological features that have been manipulated to 
modulate gene expression include the promoter and transcription 
terminator sequences, the strength of the ribosome-binding site, the number of 
copies of the cloned gene and whether the gene is plasmid borne or 
integrated into the genome of the host cell, the final cellular location of the 
synthesized foreign protein, the efficiency of translation in the host 
organism, and the intrinsic stability within the host cell of the protein 
encoded by the cloned gene. There is no single strategy for obtaining 
maximal expression of every cloned gene. Consideration of the distinctive 
features of a cloned sequence is usually required before an optimal level of 
expression is found. 

The level of foreign-gene expression also depends on the host organism. 
Currently, although a wide range of both prokaryotic and eukaryotic 
organisms can express foreign genes, many of the commercially important 
proteins produced by recombinant DNA technology are synthesized in 
Escherichia coli. The extensive use of E. coli is understandable in light of the 
vast amount of research that has been carried out on its genetics, molecular 
biology, biochemistry, and physiology. Moreover, foreign proteins can 
usually be produced by this organism rapidly and inexpensively. However, 
other host systems, such as Bacillus subtilis, yeasts, fungi, and animal, plant, 
and insect cells, are used to express certain cloned genes. Nevertheless, the 
strategies that have been elaborated for E. coli, in principle, are applicable 
to all systems. 
Gene Expression from Strong and Regulatable Promoters 

The minimal requirement for an effective gene expression system is the 
presence of a strong and regulatable promoter sequence upstream from a 
cloned gene. A strong promoter is one that has high affinity for RNA 
polymerase, with the consequence that the adjacent downstream region is 
frequently transcribed. The ability to regulate a promoter enables the cell (and 
the researcher) to control the extent of transcription in a precise manner. 
The promoter from the well-studied lac (lactose) operon of E. coli has been 
used extensively for expressing cloned genes. However, other promoters 
have distinctive properties that make them useful for controlling 
expression. Thus, many different promoters have been isolated from a range of 
organisms. 

It might seem that a good way to optimize the expression of a cloned 
gene would be to insert it into a plasmid under the control of a 
continuously activated strong promoter. However, a high level of continuous 
expression of a cloned gene is often detrimental to the host cell because it 
creates an energy drain, thereby impairing essential host cell functions. In 
addition, all or a portion of the plasmid carrying a continuously 
(constitutively) expressed cloned gene may be lost after several division cycles, 
since cells without a plasmid grow faster and eventually take over the 
culture. Such plasmid instability is a major problem that may prevent the 
efficient production of a plasmid-borne gene product on a large scale. To 
overcome this drawback, it is desirable to control transcription in such a 
way that a cloned gene is expressed only at a specific stage in the host cell 
growth cycle and only for a specified duration. This objective is achieved 
by using a strong regulatable promoter. The plasmids constructed to 
accomplish this task are called expression vectors. 
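
The takeover of a culture by plasmid-free cells can be illustrated with a 
simple numerical sketch. The Python function below is a deliberately crude 
model, not a description of any real fermentation: the per-division loss rate, 
the growth advantage of plasmid-free cells, and the function name 
plasmid_bearing_fraction are all assumed values and names chosen for 
illustration. 

```python
def plasmid_bearing_fraction(generations, loss_rate=0.01, growth_advantage=1.2,
                             initial_fraction=1.0):
    """Fraction of cells that still carry the plasmid after each generation.

    loss_rate: fraction of plasmid-bearing cells that lose the plasmid at
        each division (assumed value).
    growth_advantage: growth of plasmid-free cells relative to that of
        plasmid-bearing cells per generation (assumed value).
    """
    bearing = initial_fraction
    free = 1.0 - initial_fraction
    history = [bearing]
    for _ in range(generations):
        # Plasmid-bearing cells double, but some daughters lose the plasmid.
        new_bearing = 2.0 * bearing * (1.0 - loss_rate)
        # Plasmid-free cells grow faster and are joined by the segregants.
        new_free = 2.0 * growth_advantage * free + 2.0 * bearing * loss_rate
        total = new_bearing + new_free
        bearing, free = new_bearing / total, new_free / total
        history.append(bearing)
    return history


if __name__ == "__main__":
    history = plasmid_bearing_fraction(50)
    for generation in (0, 10, 20, 30, 40, 50):
        print(generation, round(history[generation], 3))
```

With these assumed parameters, the plasmid-bearing fraction declines slowly at 
first and then collapses once plasmid-free cells become numerous, which is the 
behavior that a regulatable promoter is meant to delay by keeping the cloned 
gene silent during most of the growth cycle. 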

Regulatable Promoters 

The most widely used strong regulatable promoters are those from the E. 
coli lac and trp (tryptophan) operons; the tac promoter, which is constructed 
from the -10 region (i.e., 10 nucleotide pairs upstream from the site of 
initiation of transcription) of the lac promoter and the -35 region of the trp 
promoter; the leftward, or pL, promoter from bacteriophage λ; and the gene 
10 promoter from bacteriophage T7. Each of these promoters interacts with 
regulatory proteins (e.g., repressors or inducers), which provide a 
controllable switch for either turning on or turning off specific transcription of 
adjacent cloned genes. In addition, each of these promoters, except the T7 
promoter, which requires T7 RNA polymerase, is recognized 
by the major form of the E. coli RNA polymerase holoenzyme. This 
holoenzyme is formed when a protein, called sigma factor, combines with the core 
proteins of RNA polymerase. Sigma factor directs the binding of the 
holoenzyme to promoter regions on the DNA. 

In the absence of lactose in the growth medium, the E. coli lac promoter 
is repressed (turned off) by the lac repressor protein, which prevents the lac 
operon from being transcribed (Fig. 6.1). Induction (turning on) of the lac 
promoter is achieved by the addition of either lactose or 
isopropyl-β-D-thiogalactopyranoside (IPTG), a synthetic inducer, to the medium (Fig. 6.2). 
Either of these substances prevents the lac repressor from binding to the lac 
operator, thereby enabling transcription to occur. In practice, lactose must 
be converted to allolactose, by low levels of β-galactosidase that are 
synthesized when the system is repressed, before it can act as an inducer. The 
enzyme β-galactosidase is encoded by the lacZ gene of the lac operon, and 
it is primarily involved in the cleavage of lactose into glucose and 
galactose. 

FIGURE 6.1 Diagrammatic representation of the effects of the concentrations of 
glucose, lactose, and cAMP in the growth medium on the level of transcription from 
the E. coli lac promoter. The arrow indicates the direction of transcription. The lac 
repressor is a tetramer. The cAMP-CAP complex binds to a CAP recognition site 
(CAP box) on the DNA. 

Transcription from the lac promoter is also regulated by the binding of 
the catabolite activator protein (CAP) (also sometimes referred to as the 
cyclic AMP [cAMP] receptor protein, or CRP) to a region of the DNA (the 
CAP box) just upstream of the promoter region (Fig. 6.1). When CAP binds 
to the CAP box, it increases the affinity of the promoter for RNA 
polymerase, thereby increasing transcription of the genes downstream from the 
promoter. The affinity of CAP for its binding site on the DNA is enhanced 
by its association with cAMP, whose level is highest when the amount of 
glucose in the medium is lowest. Thus, when inducer (lactose or IPTG) is 
present and there is no repressor bound to the operator, a high intracellular 
concentration of cAMP can lead to a high level of transcription of the genes 
downstream of the lac promoter. In practice, lacUV5, a variant of the lac 
promoter that contains an altered nucleotide sequence in the -10 region 
and is a stronger promoter than the wild-type lac promoter, is usually used 
in plasmid expression vectors. 
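
The combined effects of inducer and glucose on the wild-type lac promoter can 
be condensed into a small decision function. The Python sketch below is a 
qualitative simplification for illustration only: the function name 
lac_transcription_level and the three labels ("off," "low," and "high") are 
assumptions, the cAMP level is taken to be high only when glucose is absent, 
and any leaky transcription in the repressed state is ignored. 

```python
def lac_transcription_level(glucose_present, inducer_present):
    """Qualitative transcription level from the wild-type lac promoter.

    inducer_present: lactose (acting through allolactose) or IPTG is in
        the medium.
    glucose_present: glucose is in the medium, so cAMP is assumed to be low.
    """
    if not inducer_present:
        return "off"   # lac repressor is bound to the operator
    if glucose_present:
        return "low"   # operator is free, but little cAMP-CAP is available
    return "high"      # operator is free and cAMP-CAP is bound to the CAP box


if __name__ == "__main__":
    for glucose in (True, False):
        for inducer in (True, False):
            level = lac_transcription_level(glucose, inducer)
            print(f"glucose={glucose!s:5} inducer={inducer!s:5} -> {level}")
```

The sketch makes explicit that the highest level of transcription requires 
both conditions at once: an inducer to release the repressor and a lack of 
glucose to raise the cAMP level. 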

The trp promoter regulates transcription of the genes that are necessary 
for the biosynthesis of the amino acid tryptophan. This strong promoter is 
negatively regulated (turned off) by the trp repressor protein complexed 
with tryptophan, which binds to the trp operator and prevents 
transcription of the trp operon. Derepression (turning on) of the trp promoter is 
achieved either by removing tryptophan or by adding 3-indoleacrylic acid 
to the growth medium. Unfortunately, repression of this promoter is not 
very efficient: it is "leaky," which leads to a continuous low level of 
transcription, even when the gene should be turned off. Because of this, this 
promoter-operator cannot be used to express genes that might be toxic or 
otherwise deleterious to the growth of E. coli. 

FIGURE 6.2 Inducers of the lac promoter. (A) Lactose, which must be converted to
allolactose to be effective; (B) IPTG.

The tac and trc promoters are commonly used hybrid constructs that are
similar to one another, differing by only a single base pair. Both promoters
contain the -35 region from the trp promoter and the -10 region from the lac
promoter separated by either 16 (tac) or 17 (trc) base pairs (bp). Both of the
promoters are repressed by the lac repressor and can be induced (in the same
way as the lac promoter) by the addition of lactose or IPTG to the medium.
It has been estimated that the tac and trc promoters are three times as strong
as the trp promoter and 10 times more effective than the lac promoter.

The pL promoter is controlled by the cI repressor protein of
bacteriophage λ (Fig. 6.3). In practice, a temperature-sensitive mutant of the cI
repressor, cI857, is generally used to regulate pL-directed transcription.
Cells carrying the temperature-sensitive cI repressor are first grown at 28 to
30°C, a temperature at which the cI repressor prevents transcription
directed by the pL promoter. When the cell culture reaches the desired stage
of growth, often the mid-log phase, the temperature is shifted to 42°C. At
this temperature, the thermosensitive cI repressor is inactivated and
transcription can proceed.

The bacteriophage T7 gene 10 promoter requires T7 RNA polymerase 
for transcription (Fig. 6.4). To utilize this promoter, the T7 RNA polymerase
gene is inserted in the E. coli chromosome on a bacteriophage λ lysogen
under the control of the E. coli lac promoter. After cells are transformed by
a plasmid with a cloned gene under the control of the T7 promoter, IPTG is
added to the medium. Under these conditions, the T7 RNA polymerase
gene is induced, T7 RNA polymerase is synthesized, and the polymerase transcribes
the cloned gene. There is often a lag of an hour or more from the time that 
the T7 RNA polymerase gene is induced until the cloned (target) gene is 
transcribed. A series of plasmids called pET vectors have been developed 
to exploit the strength of the T7 promoter. 

FIGURE 6.3 Regulation of gene expression controlled by the pL promoter. (A) At
30°C, the cI repressor, which is synthesized constitutively under the control of its
own promoter (pcI), binds to the operator region (oL) of the pL promoter, thereby
preventing the target gene from being transcribed. (B) At 42°C, the
temperature-sensitive cI repressor is synthesized and then inactivated so that it no
longer interferes with transcription of the target gene. TT, transcription
termination sequence.

The effectiveness of deactivating a repressor protein and thereby
activating transcription depends on the ratio of the number of repressor
protein molecules to the number of copies of the promoter sequences. If there
are too many repressor protein molecules, then it is difficult to induce
transcription. Conversely, with too few repressor protein molecules, even when
there are more repressor molecules than copies of the promoter,
transcription occurs in the absence of induction. In these cases, the promoters are
said to be leaky. Various means have been devised to keep these regulatable
systems under complete control. For example, the repressor protein gene
and the promoter that it regulates may be placed on two different plasmids
that maintain different numbers of copies per cell; this arrangement
maintains the appropriate ratio between the repressor protein and the promoter.
Usually, the repressor gene is placed on a low-copy-number plasmid that
maintains about 1 to 8 copies per cell, and the cloned gene with its
promoter sequence is inserted into a high-copy-number plasmid that
maintains about 30 to 100 copies per cell. Alternatively, the repressor protein
gene may be carried as a single gene in the chromosomal DNA, an
arrangement that keeps repressor protein levels low. In systems that use the lac
promoter, a mutant form of the lacI gene (lacIq) that produces much higher
levels of the lac repressor is often used to decrease transcriptional leakiness
under noninduced conditions, i.e., transcription of a cloned gene in the
absence of inducer.
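
The copy-number argument above can be made concrete with a small calculation. The following Python sketch is only an illustration: the function name and the assumed number of repressor molecules made per gene copy (10) are hypothetical and do not come from the text; only the plasmid copy-number ranges (1 to 8 and 30 to 100 copies per cell) are taken from the paragraph.

```python
def repressor_to_promoter_ratio(repressor_gene_copies, promoter_copies,
                                repressors_per_gene_copy=10):
    """Estimate repressor molecules available per promoter copy.

    repressors_per_gene_copy is a hypothetical placeholder value; real
    values depend on the promoter and ribosome-binding site of the
    repressor gene.
    """
    if promoter_copies <= 0:
        raise ValueError("promoter_copies must be positive")
    return repressor_gene_copies * repressors_per_gene_copy / promoter_copies


# Copy-number ranges quoted in the text: repressor gene on a low-copy-number
# plasmid (1 to 8 copies), target promoter on a high-copy-number plasmid
# (30 to 100 copies), or repressor gene as a single chromosomal copy.
scenarios = {
    "low-copy repressor (8) / high-copy promoter (30)": (8, 30),
    "low-copy repressor (1) / high-copy promoter (100)": (1, 100),
    "chromosomal repressor (1) / high-copy promoter (30)": (1, 30),
}

for label, (rep_copies, prom_copies) in scenarios.items():
    ratio = repressor_to_promoter_ratio(rep_copies, prom_copies)
    print(f"{label}: {ratio:.2f} repressors per promoter")
```

Under these assumptions, the ratio spans more than an order of magnitude across the quoted copy-number ranges, which is why the choice of plasmid pair (or an overproducing repressor allele) matters for controlling leakiness.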

FIGURE 6.4 Regulation of gene expression controlled by the promoter for gene 10
from bacteriophage T7 (pT7). In the absence of the inducer IPTG, the constitutively
produced lac repressor, the product of the lacI gene, which is under the control of
the lacI promoter, placI, represses the synthesis of the T7 RNA polymerase that is
transcriptionally controlled by the lac operator (olac) and lac promoter (plac). In the
absence of T7 RNA polymerase, the target gene, which is under the transcriptional
control of the T7 gene 10 promoter (pT7), is not transcribed. When lactose or IPTG is
added to the medium, it binds to the lac repressor, thereby preventing it from
repressing the transcription of T7 RNA polymerase. In the presence of T7 RNA
polymerase, the target gene is transcribed. TT, transcription termination sequence.


FIGURE 6.5 A portion of the DNA sequence of the E. coli lac promoter (plac) and its
mutated, more active, form (pmut). The -10 and -35 residues are indicated by
asterisks. The -10, -20 to -13, and -35 regions are indicated.

It is generally believed that the spacer region between the -35 and the
-10 regions of E. coli promoters does not have any specific sequence
requirements and acts primarily to position the binding sites for optimal
interaction between the sigma factor of RNA polymerase and the promoter
DNA. However, this spacer region can in fact contribute to the strength of
the promoter. When a portion of the spacer region from the E. coli lac
promoter was deliberately mutated and constructs that yielded the most active
promoters were selected, one construct displayed a >40-fold increase in lac
promoter-directed RNA synthesis in the absence of CRP (and an ~8-fold
increase in the presence of CRP, where CRP stimulated RNA synthesis
10-fold). In the mutated promoter with increased activity, the -20 to -13
region of the promoter was altered from a GC-rich (7 out of 8 bp) to an
AT-rich (7 out of 8 bp) region (Fig. 6.5). Moreover, when the -20 to -13
region was excised from the mutated lac promoter and inserted into the pR
promoter from bacteriophage λ, transcription was enhanced twofold. This
is important, because while the lac promoter is normally relatively weak in
the absence of CRP, the pR promoter is one of the strongest promoters in E.
coli. Thus, by substituting an AT-rich region for a GC-rich region, and
possibly enabling the promoter region to form a stronger complex with RNA
polymerase, the intrinsic strengths of various promoters, even very strong
ones, may be increased. These altered promoters may then be utilized as
parts of an expression vector.
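
To illustrate what "GC-rich (7 out of 8 bp)" versus "AT-rich (7 out of 8 bp)" means for the -20 to -13 window, the sketch below computes the AT fraction of an 8-bp segment. The two example sequences are hypothetical placeholders chosen only to have the stated base composition; they are not the actual lac promoter sequences shown in Fig. 6.5.

```python
def at_fraction(seq):
    """Return the fraction of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AT" for base in seq) / len(seq)


# Hypothetical 8-bp windows standing in for the -20 to -13 region.
gc_rich_window = "GGCTCGCG"  # 7 of 8 bases are G or C
at_rich_window = "TTATAATG"  # 7 of 8 bases are A or T

for name, window in [("GC-rich", gc_rich_window), ("AT-rich", at_rich_window)]:
    print(f"{name} window {window}: AT fraction = {at_fraction(window):.3f}")
```

A screen of candidate spacer variants could use a measure like this as a first filter, although the text makes clear that promoter strength was determined experimentally rather than predicted from composition.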

Increasing Protein Production 

Plasmid pCP3 (Fig. 6.6) was created in an effort to obtain the highest
possible level of foreign-protein production in a recombinant E. coli strain. This
plasmid contains the strong pL promoter, the β-lactamase gene (ampicillin
resistance gene) as a selectable marker, a multiple cloning sequence
immediately downstream from the promoter, and a temperature-sensitive origin
of DNA replication that increases the plasmid's copy number 5- to 10-fold
when the growth temperature is increased to 42°C (Fig. 6.7).

E. coli cells that carry the plasmid pCP3 are first grown at 28°C and
then shifted to 42°C. At the lower temperature, the cI repressor, which is
integrated into the host E. coli chromosomal DNA, is functional, the pL
promoter is turned off, and the plasmid copy number is normal (about 60
copies per cell). At the higher temperature, the temperature-sensitive cI
repressor is inactivated, the pL promoter is active, and the plasmid copy
number increases to around 600 copies per cell. These properties make
pCP3 a particularly effective expression vector. When the gene for the
enzyme T4 DNA ligase is inserted into the multiple cloning site of pCP3,
about 20% of the cellular protein produced at 42°C is T4 DNA ligase. This
level of expression is much higher than that for even the most abundant
native E. coli proteins, such as the elongation factor EF-Tu, which have an
expression level of about 2%.

Large-Scale Systems 

FIGURE 6.6 Linear representation of E. coli expression vector pCP3 with a
temperature-sensitive origin of DNA replication (ori ts) that causes an increase in the
plasmid copy number at 42°C and a pL promoter controlled by a temperature-sensitive
repressor protein, a multiple cloning site (MCS), and an ampicillin resistance
(Ampr) gene.

FIGURE 6.7 E. coli cells that contain a plasmid with a temperature-sensitive origin of
DNA replication contain a limited number of plasmid copies at 28°C, which is
greatly amplified when the temperature is shifted to 42°C.

In small culture vessels (1 to 5 liters), induction is readily achieved either by
shifting the temperature or by adding a chemical inducer. In pilot plant-size
(20 to 200 liters) and industrial-size (>200 liters) bioreactors, however, a shift
in temperature requires time (30 to 60 minutes) and energy, both of which
can be costly. Similarly, the cost of a chemical inducer, such as IPTG, that is
required for the expression of a cloned gene in a large-scale bioreactor can
make the overall process uneconomical. To overcome some of the problems
associated with the use of the pL promoter for large-scale fermentations, a
two-plasmid system has been developed. The cI repressor was placed under
the control of the trp promoter and inserted into a low-copy-number
plasmid (Fig. 6.8). The use of a low-copy-number plasmid ensures that
excess cI repressor molecules are not produced. A second plasmid carries a
cloned gene under the control of the pL promoter. As shown in Fig. 6.8A, the
trp promoter is turned on in the absence of tryptophan, so the cI repressor
protein is synthesized and the pL promoter is turned off. In contrast, as
shown in Fig. 6.8B, the trp promoter is turned off in the presence of
tryptophan, so the cI repressor protein is not synthesized and the pL promoter is
fully active.
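
The regulatory chain in this two-plasmid system is a double negative: tryptophan switches off the trp promoter, which removes the cI repressor, which in turn releases the pL promoter. As a minimal sketch (an idealized on/off model with hypothetical function names that ignores leakiness and the time needed for existing repressor to decay), the logic can be written as:

```python
def trp_promoter_active(tryptophan_present: bool) -> bool:
    """The trp promoter is on only when tryptophan is absent."""
    return not tryptophan_present


def ci_repressor_made(tryptophan_present: bool) -> bool:
    """The cI gene is under trp promoter control on the first plasmid."""
    return trp_promoter_active(tryptophan_present)


def cloned_gene_expressed(tryptophan_present: bool) -> bool:
    """The pL promoter on the second plasmid is blocked by cI repressor."""
    return not ci_repressor_made(tryptophan_present)


for tryptophan in (False, True):
    state = "expressed" if cloned_gene_expressed(tryptophan) else "repressed"
    print(f"tryptophan present = {tryptophan}: cloned gene is {state}")
```

The two cases correspond to panels A and B of Fig. 6.8: with no tryptophan the cloned gene is repressed, and with tryptophan it is expressed.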

With this two-plasmid system, cells can be grown on an inexpensive
medium consisting of molasses and casein hydrolysate, which contains
only very small amounts of free tryptophan, and then induced to express
the cloned gene by the addition of tryptone to the medium. Tryptone
contains enough free tryptophan for efficient induction of transcription. In trial
runs of this system, cloned β-galactosidase and citrate synthase genes, after
induction by addition of tryptone to the medium, represented 21 and 24%
of the cellular protein, respectively. Thus, this system provides a potentially
inexpensive means of producing proteins from recombinant
microorganisms on a large scale.

Several considerations may limit the choice of promoters for the
large-scale production of foreign proteins. Chemical inducers can be costly, toxic,
or difficult to remove; thermal induction of promoters may induce the
production of heat shock proteins, including proteases; nutrient promoters
limit the types of media that can be used for cell growth and induction; and
oxygen-regulated promoters often have significant basal levels of activity
as a consequence of the inherent difficulty in precisely controlling
dissolved oxygen levels in the growth medium. On the other hand, promoters
that are induced when cells enter stationary phase may be useful in the
design of expression vectors for large-scale applications.

FIGURE 6.8 Dual-plasmid system for controlling the λ pL promoter by regulating the
cI repressor with tryptophan. The tryptophan promoter (ptrp) is inserted next to the
cI gene on one plasmid, and the pL promoter is placed adjacent to the cloned gene
(gene) on a second plasmid. The wavy arrows denote transcription. (A) With no
tryptophan in the medium, the cI gene is transcribed and translated, and the cI
repressor protein binds to the pL promoter, thereby blocking the transcription of the
cloned gene. (B) In the presence of tryptophan, the cI gene is repressed, no cI
product is made, and the cloned gene is transcribed and translated.

In E. coli, the housekeeping RNA polymerase sigma factor, σD,
recognizes and binds to a well-characterized consensus DNA sequence at
approximately -10 and -35 bp (upstream) from the site where transcription
is initiated (Fig. 6.9). The stationary-phase sigma factor, σS, recognizes a
similar, but not identical, sequence of nucleotides in the same region
upstream of the start of transcription. Despite the similarities between
these two types of promoter sequences, the -35 region appears not to be
important for the functioning of the known stationary-phase promoters.
Assuming that the sequences around the -35 region affect promoter
activity, in one study, researchers generated more than 150 different
stationary-phase promoters in which the DNA sequences upstream of the -10
consensus sequence were partially randomized. These workers found that
a number of these synthetic promoters had three to four times the level of
activity of naturally occurring stationary-phase promoters and had no, or
only a very low level of, background gene expression during exponential
growth. Thus, it is possible to use these promoters as parts of an expression
vector. In this case, the cells would be grown to a high density without
expressing a foreign gene of interest, and as the cells entered stationary
phase, gene expression would be rapidly and efficiently induced. Despite
its success on a laboratory scale, the effectiveness of this system remains to
be demonstrated on a large scale.

FIGURE 6.9 Consensus sequences of σD-dependent (top) and σS-dependent (bottom)
promoters. The -10 region is shown, including the -13 residue, since it is important
for σS binding/recognition. The AT-rich region downstream of the -10 region and
the AWWWWWTTTT sequence contribute to σS selectivity. N indicates any one of
A, C, G, or T, whereas W indicates an A or a T at that position. The -10 and -35
residues are indicated by an asterisk. Modified from Miksch et al., J. Biotechnol.
120:25-37, 2005.

σD: AAWWTWTTTTNNNAAANNNNN - TTGACA (-35 region) - 12 bp - TGTG - NTATAAT (-10 region)
σS: AWWWWWTTTTNNNNNNNNNNN - TTGACA (-35 region) - 12 bp - TGTG - CTATACT (-10 region) - 4-9 bp - AT-rich

Expression in Other Microorganisms 

E. coli is not necessarily the microorganism of choice for the expression of
all foreign proteins. However, our understanding of the genetics and
molecular biology of most other microorganisms is not as well developed.
Unfortunately, there is no one vector or promoter-repressor system that
gives optimal levels of gene expression in all bacteria, or even in all
gram-negative bacteria. Fortunately, many of the strategies that have been
developed for E. coli are also useful with a variety of other microorganisms. With
this in mind, the abilities of various promoters to support transcription in
other gram-negative bacteria have been tested. In one study, a set of
plasmid expression vectors containing either the lac, tac, Nm (from the
neomycin resistance gene), or S1 (from the ribosomal protein S1 gene of
Sinorhizobium meliloti) promoter was constructed. The expression of
β-galactosidase under the control of each of these promoters was examined
(Table 6.1). The results indicated that (1) all of the promoters were active to
some extent in each of the bacteria tested, (2) the tac promoter was the most
active promoter in E. coli and the least active promoter in the other bacteria,
and (3) Nm was the second least active promoter in E. coli and the most
active promoter in the other bacteria. Clearly, even though gram-negative
organisms may utilize similar DNA sequences to promote transcription,
the best promoter for use in a particular organism is not necessarily the one
that is most efficient in E. coli. Nevertheless, depending on the application,
known E. coli promoters may be satisfactory for regulating the expression
of cloned genes in other gram-negative bacteria.
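
The ranking statements above can be checked directly against the activities listed in Table 6.1. The sketch below is only a convenience for the reader: the dictionary holds the table's values, while the variable and function names are illustrative choices rather than anything used in the original study.

```python
# β-Galactosidase activity (U) from Table 6.1, keyed by promoter and host.
ACTIVITY = {
    "Nm":  {"E. coli": 1400,  "S. meliloti": 21800,
            "R. leguminosarum": 13900, "P. putida": 16300},
    "lac": {"E. coli": 2000,  "S. meliloti": 9050,
            "R. leguminosarum": 6250,  "P. putida": 9800},
    "tac": {"E. coli": 11300, "S. meliloti": 2850,
            "R. leguminosarum": 1150,  "P. putida": 2950},
    "S1":  {"E. coli": 40,    "S. meliloti": 3300,
            "R. leguminosarum": 1200,  "P. putida": 3350},
}

HOSTS = ["E. coli", "S. meliloti", "R. leguminosarum", "P. putida"]


def rank_promoters(host):
    """Return promoter names ordered from most to least active in a host."""
    return sorted(ACTIVITY, key=lambda p: ACTIVITY[p][host], reverse=True)


for host in HOSTS:
    print(f"{host}: {' > '.join(rank_promoters(host))}")
```

The output makes the reversal explicit: tac heads the list only in E. coli, whereas Nm heads it in the other three hosts.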

TABLE 6.1 β-Galactosidase activity expressed by gram-negative bacteria carrying a plasmid vector
with the E. coli lacZ gene and a heterologous promoter

                        β-Galactosidase activity (U)
Promoter   Escherichia   Sinorhizobium   Rhizobium        Pseudomonas
           coli          meliloti        leguminosarum    putida
None       16            110             130              150
Nm         1,400         21,800          13,900           16,300
lac        2,000         9,050           6,250            9,800
tac        11,300        2,850           1,150            2,950
S1         40            3,300           1,200            3,350

Adapted from Labes et al., Gene 89:37-46, 1990.

Lactic acid bacteria, e.g., Lactococcus spp., are widely used in the
production of foods such as cheese and yogurt. Genetic manipulation of these
bacteria is desirable to increase yields or add to the quality of the product.
However, any changes must not affect the production process, product
palatability and appearance, or other features. For these reasons, it is not
acceptable to modulate gene expression by adding chemical inducers or
significantly modifying process conditions, such as temperature. In
addition, the range of attainable foreign-gene expression is intrinsically limited
when only the available constitutive promoters are used. To overcome this
limitation, a plasmid library of synthetic promoters for Lactococcus lactis
was constructed in which the -10 and -35 regions of each promoter were
the same but the sequences of the intervening spacer nucleotides were
randomized (Fig. 6.10). To assay the strengths of the individual constructs,
each synthetic promoter was used to control the expression, in the related
bacterium Leuconostoc lactis, of the lacL and lacM genes, which together
encode β-galactosidase. To quantify the strengths of the various promoters,
the β-galactosidase activity from each construct was measured. Of the 36
different constructed promoters that were tested, the most active promoter
was approximately 7,000 times stronger than the least active promoter.
DNA sequence analysis revealed that the least active promoters all had
changes in the -10 or -35 consensus region. Of the promoters in which the
-10 or -35 consensus region remained intact, there was a 400-fold variation
in promoter strength. These results indicated that the sequence of the
spacer region is important for promoter activity. With this panel of
promoters, it is possible to fine-tune the expression of different genes that are
introduced in L. lactis and, by extension of the principle, the expression of
genes introduced in other organisms, as well.
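
The idea of a synthetic promoter library, with fixed consensus regions and randomized intervening positions, can be sketched in a few lines. The example below is a hedged illustration only: the template is a hypothetical one built from generic TTGACA and TATAAT hexamers with a 17-nucleotide randomized spacer, not the L. lactis consensus of Fig. 6.10, and the function names are invented for this sketch. The degenerate codes follow the figure legend (N is any base, R is A or G, W is A or T, each with equal probability).

```python
import random

# Degenerate nucleotide codes as defined in the legend to Fig. 6.10.
DEGENERATE = {"N": "ACGT", "R": "AG", "W": "AT"}

# Hypothetical template: 5 random bases, -35 hexamer, 17-nt random spacer,
# -10 hexamer, 2 random bases. Not the actual sequence used in the study.
TEMPLATE = "NNNNN" + "TTGACA" + "N" * 17 + "TATAAT" + "NN"


def expand(template, rng):
    """Replace each degenerate code with one randomly chosen base."""
    return "".join(
        rng.choice(DEGENERATE[ch]) if ch in DEGENERATE else ch
        for ch in template
    )


def make_library(template, size, seed=0):
    """Generate `size` promoter variants from a degenerate template."""
    rng = random.Random(seed)
    return [expand(template, rng) for _ in range(size)]


library = make_library(TEMPLATE, size=36, seed=1)
for promoter in library[:3]:
    print(promoter)
```

Every variant keeps the two consensus hexamers intact, so any difference in measured activity among such constructs would be attributable to the randomized positions, which is the comparison the study relied on.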


Fusion Proteins 

Often, foreign proteins, especially small ones, occur in minute quantities
when they are produced in heterologous host cells. This apparently low
level of expression is, in many instances, actually due to degradation of the
foreign protein. One way to solve this problem is to engineer a DNA
construct that encodes a target protein that is in frame with a stable host
protein. This combined, single protein, which is called a fusion protein,
protects the cloned gene product from attack by host cell proteases. In a
number of studies, proteins synthesized from cloned genes have been
found to be resistant to degradation when they are part of a fusion protein,
whereas when they are expressed as separate proteins, they are susceptible
to degradation by proteolytic enzymes (proteolysis). In general, fusion
proteins are stable because the target proteins are fused with proteins that
are not especially susceptible to proteolysis.

FIGURE 6.10 Consensus oligonucleotide sequence of L. lactis constitutive promoters.
The -10 and -35 regions are shown in red, and the spacer region is shown in blue.
To generate the library of synthetic promoters, the N sites were composed of 25%
(each) A, C, G, and T; R was 50% A and 50% G; and W was 50% A and 50% T.

5'- CAtNNNNNAGTT 1 WTCTTG AC ANNNNNNNN^ TAC TG T T

-35 -10

Fusion proteins are constructed at the DNA level by ligating a portion 
of the coding regions of two or more genes. In its simplest form, a fusion 
vector system entails the insertion of a target gene or gene segment into the 
coding region of a cloned host gene. Knowledge of the nucleotide sequences 
of the various coding segments that are joined at the DNA level is essential 
to ensure that the ligation product maintains the correct reading frame. If 
the combined DNA has an altered reading frame, i.e., a sequence of
successive codons that yields either an incomplete or an incorrect translation
product, then a functional version of the protein encoded by the cloned 
target gene will not be produced. Many strategies have been devised to 
ensure that a proper reading frame is achieved. 
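
The reading-frame requirement comes down to simple arithmetic on the lengths of the joined coding segments. The following sketch shows one way to check a planned fusion; it assumes that the upstream segment is counted from the first base of its start codon and that the downstream coding region begins at a codon boundary. The function names and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_is_in_frame(upstream_len, insert_len):
    """Return True if both the insert and the downstream segment stay in frame.

    upstream_len: coding bases from the start codon to the junction.
    insert_len: length of the inserted target sequence in bases.
    """
    return upstream_len % 3 == 0 and insert_len % 3 == 0


def has_internal_stop(seq):
    """Return True if any complete codon in frame 0 is a stop codon."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons)


# Hypothetical example: 30 bases of upstream coding sequence, 12-base insert.
print(fusion_is_in_frame(30, 12))         # in frame
print(fusion_is_in_frame(30, 13))         # one extra base shifts the frame
print(has_internal_stop("ATGGCTTAAGGA"))  # ATG GCT TAA GGA contains a stop
```

A frameshifted junction usually reveals itself as a premature stop codon a short distance downstream, so the two checks are complementary.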

Uses of Fusion Proteins 

For some applications, a fusion protein can be a satisfactory end product.
For example, a specific antigenic site that is required in large amounts and
is part of a fusion protein may be used for research or diagnostic purposes
as long as the stabilizing protein does not interfere with the correct folding
of the antigenic site. In this case, the fusion protein can be used as an
antigen, and any antibodies that are directed against the stabilizing protein
can be removed by absorption with this protein alone, thus leaving in the
antiserum only those antibodies that bind to the targeted protein
sequence.

FIGURE 6.11 A fusion protein cloning vector. The plasmid contains an ampicillin
resistance (Ampr) gene as the selectable marker, a DNA sequence encoding the
N-terminal segment of the E. coli outer membrane protein (ompF), a restriction
endonuclease site (AbeI) for cloning, and a truncated β-galactosidase gene (lacZ).
The cloned gene (Gene) is inserted into the AbeI site. After transcription and
translation, a tribrid protein is produced consisting of OmpF-target protein-LacZ.

In one instance, a fusion cloning vector that included the 5'-terminal
segment of the E. coli ompF gene, which directs the synthesis of an outer
membrane protein, and a portion of the E. coli lacZ (β-galactosidase) gene
was constructed and used to generate antibodies against selected target
proteins (Fig. 6.11). The ompF gene segment contributed the signals for the
initiation of both transcription and translation and for secretion of the
fusion protein. Even though the truncated lacZ gene lacks the codons for
the first 8 amino acids, the shortened protein encoded by this gene
fragment is still enzymatically active. This form of the enzyme β-galactosidase
is able to function with almost any peptide fused to its N terminus. The
lacZ gene was cloned on the vector at a location that put it in an altered
reading frame with respect to the ompF leader sequence. Therefore, no
functional β-galactosidase was produced. However, any cloned target
DNA that had both ompF and lacZ in frame would produce a three-part
hybrid protein that comprised a portion of the OmpF amino acid sequence,
the protein encoded by the cloned target gene, and the functional C-terminal
portion of β-galactosidase, whose activity is readily visualized on plates.
Such a hybrid protein can be used either as an antigen to produce
antibodies that will cross-react with the protein encoded by the cloned gene or
as a means of producing large amounts of small, important portions of
specific proteins.

In addition to reducing the degradation of polypeptides, a number of
fusion proteins have been developed to simplify the purification of
recombinant proteins (Table 6.2). This approach is useful for purification of
proteins expressed in either prokaryotic or eukaryotic host organisms. For
example, a vector that contains the human interleukin-2 gene joined to DNA
encoding the marker peptide sequence Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys
has the dual function of reducing the degradation of the expressed
interleukin-2 gene product and then enabling the product to be purified.
Interleukin-2 is a biological factor that stimulates both T-cell growth and
B-cell antibody synthesis. Following expression of this construct (Fig. 6.12),
the secreted fusion protein can be purified in a single step by
immunoaffinity chromatography, in which monoclonal antibodies against the marker
peptide have been immobilized on a polypropylene (or other solid) support
and act as ligands to bind the fusion protein (Fig. 6.13). Because the marker
peptide is relatively small, it does not significantly decrease the amount of
host cell resources that are available for the production of interleukin-2;
thus, the yield of interleukin-2 is not affected by the concomitant synthesis
of the marker peptide. In addition, while the fusion protein has the same
biological activity as native interleukin-2, to satisfy the government
agencies that regulate the use of pharmaceuticals, it is still necessary to remove
the marker peptide if the product is to be used for human immunotherapy
or other medical purposes. In this system, the marker sequence may be
specifically removed by treatment of the fusion protein with bovine intestinal
enterokinase (which is a highly specific protease, despite its name).

TABLE 6.2 Some protein fusion systems used to facilitate the purification of foreign proteins in E. coli and other host organisms

Fusion partner               Size               Ligand                                Elution conditions
ZZ                           14 kDa             Immunoglobulin G                      Low pH
Histidine tail               6-10 amino acids   Ni2+                                  Imidazole
Strep tag                    10 amino acids     Streptavidin                          Iminobiotin
Pinpoint                     13 kDa             Streptavidin                          Biotin
Maltose-binding protein      40 kDa             Amylose                               Maltose
GST                          26 kDa             Glutathione                           Reduced glutathione
Flag                         8 amino acids      Specific monoclonal antibody (MAb)    EDTA or low pH
Poly-arginine                5-6 amino acids    SP-Sephadex                           High salt at pH >8.0
c-myc                        11 amino acids     Specific MAb                          Low pH
S tag                        15 amino acids     S fragment of RNase A                 Low pH
Calmodulin-binding peptide   26 amino acids     Calmodulin                            EGTA and high salt
Cellulose-binding domain     4-20 kDa           Cellulose                             Urea or guanidine hydrochloride
Chitin-binding domain        51 amino acids     Chitin                                SDS or guanidine hydrochloride
SBP tag                      38 amino acids     Streptavidin                          Biotin

ZZ, a fragment of Staphylococcus aureus protein A; Strep tag, a peptide with affinity for streptavidin; Pinpoint, a protein fragment that is biotinylated and binds
streptavidin; GST, glutathione S-transferase; Flag, a peptide recognized by enterokinase; c-myc, a peptide from a protein that is overexpressed in many cancers;
S tag, a peptide fragment of ribonuclease (RNase) A; SBP (streptavidin-binding protein), a peptide with affinity for streptavidin; SP-Sephadex, a cation-exchange
resin composed of sulfopropyl groups covalently attached to Sephadex beads.

In many instances antigen-antibody complexes that form during the
immunoaffinity process are difficult to separate without the use of
denaturing chemicals. As an alternative, it has become very popular to generate
a fusion protein containing six or eight histidine residues attached to either
the N- or C-terminal end of the target protein. The histidine-tagged protein,
along with other cellular proteins, is then passed over an affinity column of
nickel-nitrilotriacetic acid. The histidine-tagged protein, but not the other
cellular proteins, binds tightly to the column. The bound protein is
eventually eluted from the column by the addition of imidazole (the side chain of
histidine). With this protocol, some cloned and overexpressed proteins
have been purified up to 100-fold with greater than 90% recovery in a
single step. In addition, this system can be utilized to purify denatured
proteins, for example, following solubilization of inclusion bodies and
before the solubilized proteins are renatured.

Cleavage of Fusion Proteins 

FIGURE 6.12 Schematic representation of the genetic construct used to produce a
secreted fusion protein consisting of a marker peptide and interleukin-2. The amino
acid sequence of the marker peptide is shown.

FIGURE 6.13 Immunoaffinity chromatographic purification of a fusion protein. An
antibody that binds to the marker peptide of the fusion protein (anti-marker
peptide antibody) is attached to a solid polypropylene support. The secreted proteins
are passed through the column containing the bound antibody. The marker peptide
portion of the fusion protein is bound to the antibody and the other proteins pass
through. The immunopurified fusion protein can then be selectively eluted from
the column by the addition of pure marker peptide.

Depending on its end use, it may be undesirable to produce a fusion
protein as the final product. For example, the presence of the host protein
segment makes most fusion proteins unsuitable for clinical use and may affect
the biological functioning of the target protein. In addition, fusion proteins
require more extensive testing before being approved by regulatory
agencies, such as the U.S. Food and Drug Administration. Thus, strategies have
been developed to remove the unwanted amino acid sequence from the
target protein. One way to do this is to join the gene for the target protein
to all or a portion of the gene for another protein (the stabilizing fusion
partner) with oligonucleotides that encode short stretches of amino acids
that are recognized by a specific nonbacterial protease. The oligonucleotide
linkers that code for the protease recognition site may be ligated upstream
of the cloned gene (so that the linker peptide will be synthesized at the
N-terminal end of the target protein) before the construct is inserted into
a fusion expression vector. For example, an oligonucleotide linker encoding
the amino acid sequence Ile-Glu-Gly-Arg can be joined to the cloned gene.
Following synthesis and purification of the fusion protein, a blood
coagulation factor called Xa can be used to release the target protein from the fusion
partner, because factor Xa is a specific protease that cleaves peptide bonds
uniquely on the C-terminal side of the Ile-Glu-Gly-Arg sequence (Fig. 6.14).
Moreover, because this peptide sequence occurs rather infrequently in
native proteins, this approach can be used to recover many different cloned
gene products.
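
The cleavage step can be pictured as a string operation on the one-letter amino acid sequence, with Ile-Glu-Gly-Arg written as IEGR. The sketch below is illustrative only: the fusion sequence is a made-up example (with valine as the first residue of the released target, as in Fig. 6.14), the function name is hypothetical, and the model cuts only at the first occurrence of the site, whereas a real digest would act on every accessible copy.

```python
def cleave_after_site(protein, site="IEGR"):
    """Split a protein on the C-terminal side of the first recognition site.

    Returns [fusion_partner_plus_linker, target_protein], or [protein]
    unchanged if the site is absent.
    """
    index = protein.find(site)
    if index == -1:
        return [protein]
    cut = index + len(site)
    return [protein[:cut], protein[cut:]]


# Hypothetical fusion: stabilizing partner, IEGR linker, target protein.
fusion = "MKTAY" + "IEGR" + "VLSPADK"
partner, target = cleave_after_site(fusion)
print("fusion partner + linker:", partner)
print("released target protein:", target)
```

Because the cut falls after the arginine, the released target carries no residual linker residues at its N terminus, which is the practical attraction of placing the site immediately upstream of the cloned gene.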

The proteases most commonly used to cleave a fusion partner/affinity
tag from a protein of interest are enterokinase, tobacco etch virus protease,
thrombin, and factor Xa. However, following this cleavage, it is necessary
to perform additional purification steps in order to separate both the
protease and the fusion protein from the protein of interest. In addition,
proteases may cleave the protein of interest at unintended sites, and the
cleavage reaction may not go to completion, leaving a portion of the
protein of interest still attached to its fusion partner. One way around this
problem is through the use of self-splicing inteins. An intein may be
defined as an internal segment of a protein that, under specific conditions,
catalyzes its own cleavage into two separate polypeptides. Of the more
than 100 known inteins, the majority contain a cysteine or serine residue at
the N-terminal side of an asparagine residue, with cleavage occurring
between these two amino acids. When the N-terminal residue at the
cleavage site is cysteine, protein chain cleavage may be initiated by the
addition of a sulfhydryl reagent, such as dithiothreitol. This approach
typically uses a gene fusion that comprises the gene of interest (encoding
the target protein), an intein, and a protein tag (Fig. 6.15). One inexpensive
way to commercialize this system is to utilize a chitin-binding domain as the
protein tag. In practice, a mixture of cellular proteins is passed through a
chitin column to which only the fusion protein binds. The rest of the
proteins pass through the column. Then, the column is treated with
dithiothreitol, the protein of interest is cleaved at the intein junction and
eluted, and the inexpensive used column is discarded. This system has been
used to purify Cre recombinase (a tool for chromosome engineering),
α1-antitrypsin (a therapeutic protein), and basic human fibroblast growth
factor (a potential therapeutic protein).
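
The risk of cleavage at unintended sites can be checked before a protease is chosen. The sketch below scans a target protein for internal copies of protease recognition motifs. Only the factor Xa motif (IEGR) comes from this chapter; the other three motifs are the commonly quoted recognition sequences and should be treated as assumptions to verify against the supplier's documentation. The function name and the test sequence are hypothetical.

```python
import re

# Recognition motifs in one-letter code. IEGR is taken from the text; the
# remaining entries are commonly cited motifs and are assumptions here.
PROTEASE_MOTIFS = {
    "factor Xa": r"IEGR",
    "enterokinase": r"DDDDK",
    "thrombin": r"LVPRGS",
    "TEV protease": r"ENLYFQ[GS]",
}


def find_internal_sites(protein: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each motif found in the protein."""
    hits = {}
    for protease, pattern in PROTEASE_MOTIFS.items():
        positions = [match.start() + 1 for match in re.finditer(pattern, protein)]
        if positions:
            hits[protease] = positions
    return hits


# Invented target sequence containing two of the motifs.
target = "MAIEGRKLDDDDKQ"
print(find_internal_sites(target))  # {'factor Xa': [3], 'enterokinase': [9]}
```

A target with a hit for a given protease would be a poor match for a linker cleaved by that protease, although real proteases also show some activity at related, non-canonical sequences that a simple motif search will miss.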

FIGURE 6.14 Proteolytic cleavage of a fusion protein by blood coagulation factor Xa.
The factor Xa recognition sequence (Xa linker sequence) lies between the amino acid
sequences of two different proteins. A functional cloned-gene protein (with Val at
its N terminus) is released after cleavage.


Surface Display

Specialized fusion protein systems have been devised for screening
complementary DNA (cDNA) libraries that contain very large numbers of
different clones (sometimes up to 5 × 10^10) for proteins that are encoded by
rarely occurring cDNAs. Generally, for these libraries, cDNA molecules are
cloned into a surface protein (filament protein, or pilus protein) gene of
either a filamentous bacteriophage, such as M13, or a bacterium. After
transcription and translation, the fusion protein is incorporated into a
surface structure of the bacteriophage or bacterium, where it can be
identified by an immunological assay. More specifically, fusions are often
made with the protein pIII gene from bacteriophage M13. Protein pIII is
normally found at the tip of this tubular bacteriophage and is responsible
for initiating phage infection of E. coli by binding to F pili. For this
purpose, a plasmid is constructed that contains a small piece of M13 DNA
that allows the plasmid (phagemid) to be packaged in vitro into M13 phage
particles; a protein pIII gene under the control of a regulatable bacterial
promoter, such as the E. coli lac promoter; and a cloning site near the 5'
end of the pIII gene for insertion of a cDNA or other coding sequence. The
expressed target protein is fused to bacteriophage M13 protein pIII near its
N terminus (Fig. 6.16). After M13 replication in E. coli cells, the plaques
are assayed immunologically using antibodies that detect the presence of the
target protein. Recombinant phagemids isolated from positive plaques can
then be used as a source of the target cDNA. This is an extremely powerful
selection system that has the capability of finding cDNAs for very rarely
expressed but important proteins. Finally, it is also possible, although
less straightforward and therefore less common, to create recombinant phage
in which a target peptide is fused to protein pVIII, the bacteriophage coat
protein.

An alternative to the phage surface display of proteins described 
above is the use of libraries with bacterial surface structures composed of 
fusion proteins that can be screened for clones that carry specific coding 
sequences. To export proteins to the surface of a gram-negative bacterium, 
such as E. coli, fusions between the genes for the target protein and for an 
outer surface protein are created. Bacterial fusion partners that have been 
used in these types of constructs include outer membrane protein A 
(OmpA) and peptidoglycan-associated lipoprotein (PAL) from E. coli, as
well as Pseudomonas aeruginosa outer membrane protein F (OprF). With 
most bacterial surface fusion proteins, the target protein is located at either 
the N or C terminus of the fusion protein. However, in some instances, 
short stretches of a target protein can be expressed in the middle of the 
fusion partner (Fig. 6.17). 

FIGURE 6.15 (A) Purification of a protein of interest (POI) from an
intein-containing fusion protein bound to a chitin chromatography column
through a chitin-binding domain (CBD). Cleavage, which is indicated by the
arrow, occurs upon the addition of dithiothreitol. (B) Products following
intein cleavage, including the released protein of interest.


FIGURE 6.16 Recombinant bacteriophage M13 displaying a fusion protein consisting
of the target protein and phage protein pIII.


FIGURE 6.17 Fusion protein between bacterial outer surface protein and a foreign
target protein joined to either the N or C terminus (A) or foreign peptides inserted
into the surface-exposed loops (B). In both cases, the foreign peptide or protein is
directed to the outer surface of the bacterial cell.


In addition to facilitating the screening of large cDNA libraries,
surface-displayed proteins can provide an effective means of overexpressing
peptides or proteins. For example, in one study, the amino-acid-repeating
epitope (antigenic determinant) Asn-Ala-Asn-Pro of a protein from the 
parasite Plasmodium falciparum, the causative agent of malaria, was inserted 
into the regions that encode surface-exposed loops of the major outer
membrane protein from P. aeruginosa (OprF). Whole bacterial cells expressing
this fusion protein reacted positively when challenged with monoclonal 
antibodies against P. falciparum. It may, therefore, be possible to use some 
surface-displayed fusion proteins as vaccines (see chapter 11). 


Translation Expression Vectors 

Putting a cloned gene under the control of a regulatable, strong promoter, 
although essential, may not be sufficient to maximize the yield of the 
cloned gene product. Other factors, such as the efficiency of translation and 
the stability of the newly synthesized cloned-gene protein, may also affect 
the amount of product. 

In prokaryotic cells, various proteins are not necessarily synthesized 
with the same efficiency. In fact, they may be produced at very different 
levels (up to several hundredfold) even if they are encoded within the same 
polycistronic messenger RNA (mRNA). Differences in translational efficiency
and in transcriptional regulation enable the cell to have hundreds or
even thousands of copies of some proteins and only a few copies of 
others. 

In part, the molecular basis for differential translation is the presence
of a translational initiation signal called a ribosome-binding site in the
transcribed mRNA. A ribosome-binding site is a sequence of 6 to 8 nucleotides
(e.g., UAAGGAGG) in mRNA that can base pair with a complementary
sequence (AUUCCUCC, written 3' to 5', for E. coli) on the RNA component of
the small ribosomal subunit.
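
The base pairing between the two sequences quoted above can be verified with a short Python sketch. It assumes only Watson-Crick pairing (A with U, G with C); the function names are hypothetical. Note that the rRNA sequence pairs with the mRNA position by position only when it is read in the opposite direction (3' to 5'), because the two strands are antiparallel.

```python
# Watson-Crick partners for RNA bases.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(rna: str) -> str:
    """Base-by-base complement; the result reads 3'->5' if the input reads 5'->3'."""
    return "".join(RNA_PAIR[base] for base in rna)


def watson_crick_matches(mrna_site: str, rrna_3_to_5: str) -> int:
    """Count positions at which the mRNA site pairs with the rRNA read 3'->5'."""
    return sum(RNA_PAIR[a] == b for a, b in zip(mrna_site, rrna_3_to_5))


rbs = "UAAGGAGG"                 # example ribosome-binding site from the text
anti_rbs = complement(rbs)
print(anti_rbs)                  # AUUCCUCC (rRNA strand, read 3'->5')
print(anti_rbs[::-1])            # CCUCCUUA (same rRNA strand, read 5'->3')
print(watson_crick_matches(rbs, "AUUCCUCC"))         # 8
print(watson_crick_matches("UAAGGUGG", "AUUCCUCC"))  # 7
```

A site with fewer matching positions would be expected to bind the ribosomal RNA less strongly, which is the qualitative point made in the next paragraph; a count of matches is only a crude stand-in for binding strength.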

Generally, the stronger the binding of the mRNA to the ribosomal 
RNA, the greater the efficiency of translational initiation. For this reason, 
many E. coli expression vectors have been designed to ensure that the 
mRNA of a cloned gene contains a strong ribosome-binding site. In effect, 
this means that heterologous prokaryotic and eukaryotic genes can be 
translated readily in E. coli. However, certain other conditions must be
satisfied for this approach to function properly. First, the ribosome-binding
sequence must be located a precise distance from the translational start
codon of the cloned gene. (At the RNA level, the translational start codon is
usually AUG. At the DNA level, the strand that has the ATG sequence is called
the coding strand, and the complementary strand, which acts as the
template for transcription, is the noncoding strand. By convention, a start
codon at the DNA level is designated ATG.) Second, the DNA sequence
that includes the ribosome-binding site through the first few codons of the 
gene of interest must not contain nucleotide sequences that have regions of 
complementarity and therefore after transcription can fold back, i.e., form 
intrastrand loops (Fig. 6.18), thereby thwarting the interaction of the mRNA 
with the ribosome. The local secondary structure of the mRNA, which can
either shield or expose the ribosome-binding site, determines the extent to
which the mRNA can bind to the appropriate sequence on the ribosome
and initiate translation. Thus, for each cloned gene, it is important to
establish that the ribosome-binding site is properly placed and that the
secondary structure of the mRNA does not prevent its access to the
ribosome.


MILESTONE

The tac Promoter: a Functional Hybrid Derived from the
trp and lac Promoters

H. A. de Boer, L. J. Comstock, and M. Vasser
Proc. Natl. Acad. Sci. USA 80:21-25, 1983

De Boer and his colleagues began their efforts to construct the tac promoter
with the idea of combining portions of two different strong and regulatable
promoters to create an even stronger promoter that would direct very high
levels of foreign-gene expression. When they undertook their studies,
although the DNA sequences of a number of prokaryotic promoters, mostly from
E. coli, were known, the precise features that enabled a promoter to be
efficient at directing transcription were not well understood. It was known
that almost all mutations that affected the strength of a prokaryotic
promoter were found in either the -10 region or the -35 region, which are
approximately 10 or 35 bp upstream of the mRNA transcription start site,
respectively. Moreover, only mutations that made an existing promoter more
like the consensus sequences for each of these regions, i.e., 5'-TATAAT-3'
for the -10 region and 5'-TTGACA-3' for the -35 region, increased the
strength of the promoter. The consensus sequences had been deduced by
comparing the DNA sequences of all known promoters and determining which
nucleotides occurred most often. De Boer and his colleagues also knew that
the lacUV5 promoter, which is a stronger variant of the lac promoter, had a
consensus sequence at its -10 but not its -35 region, while the trp promoter,
which normally controls the transcription of genes involved in the
biosynthesis of tryptophan, had a consensus sequence at its -35 but not its
-10 region. They decided to create a fusion promoter which included the -10
region from the lac promoter and the -35 region from the trp promoter. They
tested this new "tac" promoter, as they called it, for its ability to direct
the synthesis of the enzyme galactose kinase in E. coli and compared it in
the same assay system with the trp and lac promoters. In agreement with their
initial idea, the tac promoter was found to be approximately 5 times stronger
than the trp promoter and 10 times stronger than the lac promoter.

In addition, like the lac promoter, the tac promoter was repressed by the lac
repressor and derepressed by IPTG. Thus, this new promoter was not only
strong, but also regulatable.


FIGURE 6.18 Example of secondary structure of the 5' end of an mRNA that
would prevent efficient translation. The ribosome-binding site is GGGGG, the
initiator codon is AUG (shown in red), and the first few codons are
CAG-CAU-GAU-UUA-UUU. The mRNA is oriented with its 5' end to the left and
its 3' end to the right. Note that in addition to the traditional A-U and
G-C base pairs in mRNA, G can also base pair to some extent with U.


FIGURE 6.19 The expression vector pKK233-2. The plasmid pKK233-2 carries the
ampicillin resistance (Amp^r) gene as a selectable marker gene, the tac
promoter (ptac), the lacZ ribosome-binding site (rbs), three restriction
endonuclease cloning sites (NcoI, PstI, and HindIII), and two transcription
termination sequences (T1 and T2). The arrow indicates the direction of
transcription. The plasmid is not drawn to scale.

A number of convenient vector systems that incorporate both transcriptional
and translational signals for the expression of cloned eukaryotic
genes in E. coli have been developed. One such system, the expression
vector pKK233-2, includes the following elements (Fig. 6.19): an ampicillin
resistance gene as a selectable marker, the tac promoter, the lacZ
ribosome-binding site, an ATG start codon located 8 nucleotides downstream
from the ribosome-binding site, and the transcription terminators T1 and T2
from bacteriophage λ. The cloned gene is inserted into an NcoI, PstI, or
HindIII site that lies between the ribosome-binding site and the
transcription terminators so that it is in the same
reading frame as the ATG start codon. After induction and transcription,
the mRNA of a cloned gene is efficiently translated. However, since the 
nucleotide sequences that encode the amino acids in the N-terminal region 
of the target protein vary from one gene to another, it is not possible to 
design a vector that will eliminate the possibility of mRNA fold-back in all 
instances. Therefore, no single optimized translational initiation region can 
guarantee a high rate of translation initiation for all cloned genes. 
Consequently, the expression vectors described above are merely starting 
points for the optimization of translation initiation. 
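
The two conditions discussed above (spacing between the ribosome-binding site and the start codon, and the absence of fold-back structure) can be screened crudely in Python. This is a rough sketch under several assumptions: spacing is counted as the number of nucleotides between the last base of the ribosome-binding site and the A of AUG, a "stem" is any pair of equal-length segments that can pair along their whole length (A-U, G-C, or G-U), and the default lengths are arbitrary. Real secondary-structure prediction uses thermodynamic models that this does not attempt; all names and sequences are hypothetical.

```python
# Base pairs allowed in an RNA stem: Watson-Crick pairs plus the G-U wobble pair.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def rbs_to_start_spacing(mrna, rbs="UAAGGAGG", start="AUG"):
    """Nucleotides between the end of the first RBS and the next start codon.

    Returns None if either element is missing.
    """
    rbs_pos = mrna.find(rbs)
    if rbs_pos == -1:
        return None
    rbs_end = rbs_pos + len(rbs)
    start_pos = mrna.find(start, rbs_end)
    if start_pos == -1:
        return None
    return start_pos - rbs_end


def find_stems(mrna, stem_len=5, min_loop=3):
    """List (i, j) start indices of two segments that could pair as a hairpin stem.

    The segment starting at i pairs, in antiparallel fashion, with the segment
    starting at j; at least min_loop unpaired bases separate the two segments.
    """
    stems = []
    n = len(mrna)
    for i in range(n - 2 * stem_len - min_loop + 1):
        for j in range(i + stem_len + min_loop, n - stem_len + 1):
            if all((mrna[i + t], mrna[j + stem_len - 1 - t]) in CAN_PAIR
                   for t in range(stem_len)):
                stems.append((i, j))
    return stems


# Invented mRNA: RBS, an 8-nucleotide spacer, then the start of a coding region.
mrna = "UAAGGAGG" + "AACAGACC" + "AUGGCU"
print(rbs_to_start_spacing(mrna))    # 8
print(find_stems("GGGGGAAACCCCC"))   # [(0, 8)]
```

A stem whose first or second segment overlaps the ribosome-binding site or the start codon would be a candidate for redesign, for example by changing synonymous codons near the N terminus.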

A cellular incompatibility that can interfere with efficient translation
occurs when a cloned gene has codons that are rarely used by the host cell.
For example, AGG, AGA, AUA, CUA, and CGA are the least-used codons
in E. coli. In these cases, the host cell may not produce enough of the
transfer RNAs (tRNAs) that recognize these rarely used codons, and the
yield of the cloned-gene protein is much lower than expected. An
insufficient supply of certain tRNAs may lead to either a reduction in the
amount of protein synthesized or the incorporation of incorrect amino acids
into the protein. Any codon that is used less than 5 to 10% of the time by the
host organism may cause problems. Particularly detrimental to high levels
of expression are places where two or more rarely used codons are close
or adjacent or appear in the N-terminal portion of the protein. There are
several experimental approaches that can be used to alleviate this problem.
(1) If the target gene is eukaryotic, it may be cloned and expressed in a
eukaryotic host cell. (2) A new version of the target gene containing codons
that are more commonly used by the host cell may be chemically
synthesized (codon optimization). (3) A host cell that has been engineered to
overexpress several rare tRNAs may be employed (Fig. 6.20). In fact, an
E. coli strain that overproduces the tRNAs encoded by argU, ileY, and leuW,
which are specific for the codons AGG/AGA, AUA, and CUA, respectively, is
available commercially. This strain is sold for the explicit purpose of
expressing a high level of foreign proteins that use these rare E. coli
codons. With this commercially available E. coli strain, it was possible to
overexpress the Ara h2 protein, a peanut allergen, approximately 100-fold
over the amount that was synthesized in conventional E. coli cells. Using
the approach described, it should be possible to produce large quantities
of a variety of heterologous proteins that are otherwise difficult to express
in E. coli hosts (Table 6.3).
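
Before choosing among these approaches, it can help to see where the rare codons fall in a given gene. The following Python sketch reports the positions of the five rare E. coli codons named above, flags adjacent pairs, and lists those near the N terminus. The cutoff of 20 codons for "N-terminal" is an arbitrary assumption, and the function name and test sequence are hypothetical.

```python
# Least-used E. coli codons listed in the text (RNA alphabet).
RARE_CODONS = {"AGG", "AGA", "AUA", "CUA", "CGA"}


def rare_codon_report(cds: str, n_terminal_codons: int = 20) -> dict:
    """Summarize rare-codon usage in a coding sequence given as DNA or RNA.

    Codon positions are 0-based; any incomplete trailing codon is ignored.
    """
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    rare = [i for i, codon in enumerate(codons) if codon in RARE_CODONS]
    adjacent = [(a, b) for a, b in zip(rare, rare[1:]) if b - a == 1]
    return {
        "codons": len(codons),
        "rare_positions": rare,
        "adjacent_pairs": adjacent,
        "n_terminal": [i for i in rare if i < n_terminal_codons],
        "rare_fraction": len(rare) / len(codons) if codons else 0.0,
    }


# Invented coding sequence: ATG AGG AGA GCT CTA TAA
report = rare_codon_report("ATGAGGAGAGCTCTATAA")
print(report["rare_positions"])   # [1, 2, 4]
print(report["adjacent_pairs"])   # [(1, 2)]
print(report["rare_fraction"])    # 0.5
```

A gene with many hits, especially adjacent ones near the start, would be a natural candidate for codon optimization or for expression in a host that overproduces the corresponding tRNAs.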
FIGURE 6.20 Schematic representation of the expression of foreign proteins in a
typical E. coli host cell (A) and in an E. coli host cell that has been engineered to 
overexpress several rare tRNAs (B). 


TABLE 6.3 Increases in gene expression that result from altering the
codon usage of the wild-type gene (or cDNA) to more closely correspond
to the host E. coli cell

Protein                                         Improvement (fold)
Human interleukin-2                             16
Clostridium tetani tetanus toxin fragment C     4
Human cardiac troponin T                        10-40
Mouse c-FOS protein                             >200
Spinach plastocyanin                            1.2
Human neurofibromin                             3
Human glutathione transferase M2-2              140
Human phosphatidylcholine transfer protein      >100
Human interleukin-6                             3
Human interleukin-18                            5
Plasmodium vaccine candidate antigen            4

In some cases, only a small number of codons were altered, while in others, the entire
gene was synthesized with the optimal codon usage for expression in E. coli.


Increasing Protein Stability

In recent years, it has become almost routine to overproduce a wide range
of different foreign proteins in E. coli or other host organisms. However, to
recover large amounts of purified active protein requires that the protein be
as stable as possible. In this regard, there are certainly large differences in
the intrinsic stabilities of different proteins. In addition, a protein produced
in a heterologous host cell may, for various reasons, be less stable than the
same protein produced in its normal cellular environment.

Intrinsic Protein Stability

Under normal growing conditions, the half-lives of different proteins range
from a few minutes to hours. The basis for this differential stability is both
the extent of disulfide bond formation and the presence of certain amino
acids at the N terminus. For example, when specific amino acids were
attached to the N terminus of β-galactosidase, the in vitro survival time of
the protein varied from approximately 2 minutes to more than 20 hours
(Table 6.4). Amino acid additions that extend the intrinsic survival of a
protein can be readily incorporated into cloned genes. Often the presence
of a single extra amino acid at the N-terminal end is sufficient to stabilize a 
target protein, probably by making it less susceptible to degradation by 
certain cellular proteases. Long-lived proteins can accumulate in cells and 
thereby increase the yield of the product. This phenomenon occurs in both 
prokaryotes and eukaryotes. 

In contrast to the amino acids at the N terminus, which can increase the 
stability of a protein, there are internal amino acid sequences that make a 
protein more susceptible to proteolytic degradation. These regions of the 
protein, which are called PEST sequences, are rich in proline (P), glutamic 
acid (E), serine (S), and threonine (T). They are often, but not always, 
flanked by clusters of positively charged amino acids, such as lysine and 
arginine, and may act to mark proteins for degradation within the cell. In 
some instances, it is possible to enhance the stability of a protein by altering 
its PEST regions by genetic manipulation. Such changes, of course, must 
not alter the function of the target protein. 
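
A first-pass search for candidate PEST regions can be written as a sliding-window count of the four residues. The sketch below is only a heuristic: the window length and the threshold fraction are arbitrary assumptions, it ignores the flanking positively charged residues mentioned above, and it is not a published PEST-scoring method. The function name and the test sequence are hypothetical.

```python
PEST_RESIDUES = set("PEST")  # proline, glutamic acid, serine, threonine


def pest_rich_windows(protein: str, window: int = 12, min_fraction: float = 0.5):
    """Return (1-based start, segment, fraction) for windows rich in P/E/S/T."""
    hits = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        fraction = sum(aa in PEST_RESIDUES for aa in segment) / window
        if fraction >= min_fraction:
            hits.append((i + 1, segment, round(fraction, 2)))
    return hits


# Invented sequence with an obvious P/E/S/T-rich stretch in the middle.
protein = "MKKLL" + "PESTPESTPEST" + "KRAAA"
print(pest_rich_windows(protein, window=12, min_fraction=1.0))
# [(6, 'PESTPESTPEST', 1.0)]
```

Windows flagged this way are only candidates; whether altering them stabilizes the protein without affecting its function has to be tested experimentally.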


TABLE 6.4 Stability of β-galactosidase with certain
amino acids added to its N terminus

Amino acid added        Half-life
Met, Ser, Ala           >20 h
Thr, Val, Gly           >20 h
Ile, Glu                >30 min
Tyr, Gln                ~10 min
Pro                     ~7 min
Phe, Leu, Asp, Lys      ~3 min
Arg                     ~2 min

Adapted from Bachmair et al., Science 234:179-186, 1986.
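
The half-life data in Table 6.4 lend themselves to a simple lookup, for example when deciding which extra residue to place at the N terminus of a target protein. The Python sketch below encodes the table; the numeric values in minutes are lower bounds or approximations read from the table for ranking purposes only, the 60-minute threshold is an arbitrary assumption, and the function names are hypothetical.

```python
# (residues, half-life as given in Table 6.4, approximate value in minutes)
TABLE_6_4 = [
    (("Met", "Ser", "Ala"), ">20 h", 1200),
    (("Thr", "Val", "Gly"), ">20 h", 1200),
    (("Ile", "Glu"), ">30 min", 30),
    (("Tyr", "Gln"), "~10 min", 10),
    (("Pro",), "~7 min", 7),
    (("Phe", "Leu", "Asp", "Lys"), "~3 min", 3),
    (("Arg",), "~2 min", 2),
]


def half_life_for(residue: str):
    """Half-life label for an added N-terminal residue, or None if not in the table."""
    for residues, label, _minutes in TABLE_6_4:
        if residue in residues:
            return label
    return None


def stabilizing_residues(min_minutes: int = 60):
    """Residues whose tabulated half-life is at least min_minutes."""
    return [res for residues, _label, minutes in TABLE_6_4
            if minutes >= min_minutes for res in residues]


print(half_life_for("Arg"))       # ~2 min
print(half_life_for("Ile"))       # >30 min
print(stabilizing_residues())     # ['Met', 'Ser', 'Ala', 'Thr', 'Val', 'Gly']
```

The values were measured for β-galactosidase, so they indicate a trend for other proteins rather than predicting their actual half-lives.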
FIGURE 6.21 Regulation of the synthesis of a thioredoxin-target protein fusion in the
absence (A) or presence (B) of tryptophan in the growth medium. The arrows
labeled ptrp and pL indicate the direction of transcription. otrp, the operator region
where the trp repressor protein binds; oL, the operator region where the cI repressor
binds; ptrp, the trp promoter; pL, the leftward promoter from bacteriophage λ; TT,
transcription termination region. The box between the thioredoxin and target genes
indicates the DNA region that codes for the peptide that acts as the enterokinase
cleavage site; the horseshoes indicate the binding of a repressor protein to its
operator region.


Facilitating Protein Folding 

Many of the proteins that are produced in E. coli accumulate in the form of
insoluble, intracellular, biologically inactive inclusion bodies. Although a
small amount of biologically active protein can often be recovered from
inclusion bodies, the extraction procedure requires expensive and
time-consuming protein solubilization and refolding procedures. Since the in
vivo insolubility of proteins is often a consequence of their incorrect
folding, various strategies have been devised to avoid this problem. For
example, fusion proteins that contain thioredoxin, an 11.7-kilodalton (kDa)
protein, as the fusion partner remain soluble even when up to 40% of the
cellular protein consists of the fusion protein. With this system, the target
gene is cloned into a multiple cloning site just downstream from the
thioredoxin gene. Both of these genes are under the control of the pL
promoter on an E. coli plasmid vector (Fig. 6.21). The E. coli host cell for
this system has a cI repressor protein construct that consists of a copy of
the gene for the cI repressor protein under the transcriptional control of
the trp promoter integrated into the chromosomal DNA. In the absence of
tryptophan (Fig. 6.21A), a sufficient amount of cI repressor protein is
synthesized to prevent transcription from the pL promoter and therefore
prevent the production of the fusion protein. When tryptophan is added to the
growth medium (Fig. 6.21B), the trp promoter is turned off by the trp
repressor, the cI repressor protein is not synthesized, and the DNA construct
encoding the fusion protein is transcribed from the pL promoter on the
plasmid and then translated. In this case, the fusion protein, which consists
of thioredoxin and the target protein, is synthesized and accumulates
preferentially at osmotically sensitive sites called adhesion zones at the
inner periphery of the host E. coli cytoplasmic membrane. The soluble fusion
protein is selectively released by osmotic shock from E. coli cells into the
growth medium. If required, the native form of the target protein can be
released from the fusion protein by treatment with the enzyme enterokinase.
Finally, since thioredoxin is stable at temperatures as high as 80°C, in
those instances where the target protein is also stable at high temperatures,
the fusion protein may be purified by high-temperature incubation, a
condition that denatures most of the other cellular proteins.
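
The two-step repression cascade in this system can be confusing because adding tryptophan switches expression on, not off. The following Python sketch reduces the description above to on/off logic; it is a simplification that ignores partial repression and timing, and the function name is hypothetical.

```python
def fusion_protein_expressed(tryptophan_present: bool) -> bool:
    """On/off model of the trp-cI-pL cascade described in the text."""
    # The trp repressor shuts off the trp promoter when tryptophan is present.
    trp_promoter_active = not tryptophan_present
    # The chromosomal cI gene is transcribed only from the trp promoter.
    ci_repressor_made = trp_promoter_active
    # The cI repressor blocks the pL promoter on the plasmid.
    pl_promoter_active = not ci_repressor_made
    # The thioredoxin-target fusion is made whenever pL is active.
    return pl_promoter_active


for trp in (False, True):
    print(f"tryptophan present: {trp} -> fusion protein made: "
          f"{fusion_protein_expressed(trp)}")
```

Running the loop shows no fusion protein without tryptophan and production with tryptophan, matching panels A and B of Fig. 6.21.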

In E. coli and other gram-negative bacteria, disulfide bond formation
takes place in the periplasmic space, which is topologically equivalent to
the endoplasmic reticulum in eukaryotes but is a much more oxidizing
environment. Disulfide bond formation in E. coli requires the participation
of two soluble periplasmic enzymes (DsbA and DsbC) and two
membrane-bound enzymes (DsbB and DsbD). Unfortunately, foreign proteins that
contain three or more disulfides generally do not fold correctly in bacteria 
and often form inclusion bodies. In these instances, the foreign protein is 
usually produced in animal or plant cells. However, researchers have 
devised some strategies to overcome this problem for recalcitrant proteins 
with a large number of disulfide bonds. 

Human tissue plasminogen activator is an important therapeutic protein
that is used to remove blood clots. The active form of this protein is a
527-amino-acid serine protease that folds into five distinct structural
domains with 35 cysteine residues, 34 of which participate in the formation
of 17 disulfide bonds. The cDNA for human tissue plasminogen activator was
cloned downstream of a DNA sequence that encodes a leader peptide that
was previously used to facilitate the secretion (to the periplasm) of other
eukaryotic proteins. However, in this case, only trace amounts of human
tissue plasminogen activator were produced. This low level of activity is
thought to reflect the fact that this complex protein is unable to fold
properly in E. coli. In an attempt to increase the yield of human tissue
plasminogen activator, the gene was coexpressed with the gene for either rat
or yeast protein disulfide isomerase, both of which had been found to assist
protein folding in other cases; however, this strategy did not affect the
amount of active human tissue plasminogen activator that could be
obtained. On the other hand, the coexpression of high levels of DsbC (Fig.
6.22) resulted in more than a 100-fold increase in the production of
functional human tissue plasminogen activator. Alterations in the levels of
the other Dsb proteins did not affect the amount of active human tissue
plasminogen activator that could be recovered. To realize the maximum benefit
from DsbC overproduction, it was necessary to induce the synthesis of this
protein approximately 30 minutes prior to the induction of human tissue
plasminogen activator.
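
One way to appreciate why a protein with 17 disulfide bonds is so hard to fold is to count the possible pairings. The Python sketch below counts the ways to form a given number of disulfide bonds from a given number of cysteines, under the simplifying assumption that any two cysteines could pair (real protein geometry rules out most combinations). The function name is hypothetical.

```python
from math import comb, prod


def disulfide_pairings(n_cysteines: int, n_bonds: int) -> int:
    """Number of distinct ways to form n_bonds disulfides from n_cysteines."""
    paired = 2 * n_bonds
    if paired > n_cysteines:
        raise ValueError("not enough cysteines for that many disulfide bonds")
    # Choose which cysteines take part in a bond ...
    ways_to_choose = comb(n_cysteines, paired)
    # ... then pair them up: (paired - 1) * (paired - 3) * ... * 3 * 1.
    ways_to_pair = prod(range(1, paired, 2))
    return ways_to_choose * ways_to_pair


print(disulfide_pairings(4, 2))     # 3
print(disulfide_pairings(6, 3))     # 15
print(disulfide_pairings(35, 17))   # the tissue plasminogen activator case
```

Only one of these arrangements corresponds to the native protein, which gives some intuition for why an enzyme that rearranges incorrect disulfide bonds, such as DsbC, can make such a large difference to the yield of active protein.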

Similar research in another laboratory found that while DsbC
overproduction had the largest impact on promoting the correct folding of
horseradish peroxidase in E. coli and preventing its aggregation, the
simultaneous overproduction of all four Dsb proteins yielded the greatest
amount of properly folded and active horseradish peroxidase. Thus, the
experience of several researchers indicates that the correct folding and
disulfide bond formation of eukaryotic proteins produced in E. coli require
high levels of the Dsb proteins or possibly other protein disulfide
isomerases.


Coexpression Strategies 

The expression of some thermolabile foreign proteins in E. coli host strains,
which are typically grown at 37°C, often results in the formation of
inclusion bodies of inactive protein. This occurs because the foreign protein
misfolds when it cannot attain its native active conformation. A variety of
strategies have been developed, albeit with limited success, to circumvent
this problem. Cultivation of recombinant strains at low temperatures,
which is beneficial to proper protein folding, often significantly increases
the amount of recoverable active protein. However, mesophilic bacteria
like E. coli grow extremely slowly at low temperatures. In one study, the
chaperonin 60 gene (cpn60) and the cochaperonin 10 gene (cpn10) from the
psychrophilic bacterium Oleispira antarctica were introduced into a host
strain of E. coli with the result that the E. coli strain gained the ability to
grow at a high rate at low temperatures (4 to 10°C). This strain was
subsequently transformed with a plasmid encoding a temperature-sensitive
esterase. The expression of the temperature-sensitive esterase in the E. coli
strain carrying the two chaperone genes at 4 to 10°C yielded esterase
specific activity that was 180-fold higher than the activity from the native
E. coli strain (without chaperonins) grown at 37°C. Interestingly, and
contrary to expectations, the psychrophile chaperonins do not facilitate the
proper folding of the esterase and do not affect its catalytic properties.
Although very high levels of expression of the cloned esterase were not
attained, this work is an important first step in the development of
expression systems for proteins that are sensitive to high temperature and
might otherwise be difficult to produce. The next logical step in the
development of this system would likely be the construction of an E. coli
host cell that contains stably integrated copies of these chaperonin genes in
the chromosome.
FIGURE 6.22 Schematic representation of the effect of overproduction of the
E. coli disulfide bond-forming protein (DsbC) on the synthesis of active
human tissue plasminogen activator (tPA). (A) When too few molecules of
DsbC are available, the tPA is folded incorrectly and is inactive.
(B) Overproduction of DsbC results in correctly folded and active human tPA.
In both instances, the tPA is secreted into the periplasm.
Overcoming Oxygen Limitation

E. coli and most other microorganisms that are used to express foreign
proteins generally require oxygen for optimal growth. Unfortunately, oxygen
has only limited solubility in aqueous media. Thus, as the cell density of a
growing culture increases, the cells rapidly deplete the growth medium of
dissolved oxygen. When cells become oxygen limited, exponential growth
slows and the culture rapidly enters a stationary phase, during which
cellular metabolism changes. One consequence of the stationary phase is the
production by the host cells of proteases that can degrade foreign proteins.
Oxygen dissolves into the growth medium very slowly, so this problem is
not always alleviated when large amounts of air or oxygen are added to the
growth medium, even with high stirring rates. Modification of the
fermenter configuration to optimize the aeration and agitation of cells, and
addition of chemicals to the growth medium to increase the solubility of
oxygen have been tried in an effort to deal with the limited amount of
dissolved oxygen. However, these efforts have met with only limited success.

Use of Protease-Deficient Host Strains 

One possible way to stabilize foreign proteins produced in E. coli is to
develop host strains that are deficient in the production of proteolytic
enzymes. However, this is not as simple as it might appear. E. coli has at
least 25 different proteases, and only a few of them have been studied at the
genetic level. Moreover, these proteases are important for the degradation
of abnormal or defective proteins, which is a housekeeping function that is
necessary for the continued viability of the cells. In one study, strains with
mutations in one or more protease genes were constructed. Generally, the
strains that were most deficient in overall protease activity grew most
slowly. Thus, decreasing protease activity caused cells to be debilitated.
However, an E. coli strain with mutations in both the gene for the RNA
polymerase sigma factor that is responsible for heat shock protein synthesis
(rpoH) and the gene for a protease that is required for cell growth at high
temperatures (degP) secreted target proteins that had a 36-fold-greater
specific activity than when they were produced in wild-type host cells. This
increase in activity reflects a decrease in the proteolytic degradation of
these secreted proteins.

Bacterial Hemoglobin 

Some strains of the Vitreoscilla bacterium, a gram-negative obligate aerobe,
normally live in oxygen-poor environments, such as stagnant ponds. To
obtain a sufficient amount of oxygen for their growth and metabolism,
these organisms synthesize a hemoglobin-like molecule that binds oxygen
from the environment and increases the level of available oxygen inside
cells. When the gene for this protein was cloned and expressed in E. coli, the
transformants displayed higher levels of synthesis of both cellular and
recombinant proteins, higher levels of cellular respiration, a higher ATP
production rate, and higher ATP contents, especially at low levels of
oxygen in the growth medium, than did nontransformed cells. In these
transformants, the Vitreoscilla hemoglobin increases the intracellular
effective oxygen concentration, which raises the activities of both
cytochrome d and cytochrome o. This causes an increase in proton pumping,
with the subsequent generation of ATP, thereby providing additional energy
for cellular metabolic processes (Fig. 6.23). For this strategy to be
effective in different host cells, not only must the Vitreoscilla sp.
hemoglobin gene be efficiently expressed, but the host cells must also be
able to synthesize the heme portion of the hemoglobin molecule. Once these
conditions have been met, this strategy can be used to improve growth, as
well as foreign-gene expression, in a range of different industrially
important bacteria, including E. coli, Streptomyces spp., Enterobacter
aerogenes, and Xanthomonas maltophilia (Table 6.5).


Limiting Biofilm Formation 

In many natural environments, bacteria are commonly found to be associated
with solid surfaces and only rarely exist as free-swimming entities.
The bacterial cells typically attach to a surface, form a monolayer, and later
organize into a biofilm, a mixture of bacterial cells and polysaccharides
(typically alginate) that may be as much as 100 to 200 µm thick (Fig. 6.24).
When they are part of a biofilm, bacterial cells are generally protected
against hostile agents in the environment, such as biocides, bacteriophages,
and protozoa.

Bacterial cells that form biofilms, or otherwise produce significant 
amounts of extracellular polysaccharide, are difficult to transform with 
plasmid DNA and are typically resistant to high levels of antibiotics (which
have difficulty entering the bacterial cells). These cells are limited in the
amount of foreign protein that they can produce, probably reflecting the 
large amount of cellular resources directed toward polysaccharide synthesis 
and the limited amounts of nutrients that are able to enter cells within the 
biofilm. 


FIGURE 6.23 Schematic representation of the binding of O2 by Vitreoscilla
hemoglobin, the utilization of this O2 in pumping (by proteins such as
cytochromes) H+ from the cytoplasm to the periplasm, and the subsequent
coupling of H+ uptake (by ATPase) to ATP generation. ADP, adenosine
diphosphate.


TABLE 6.5 Use of Vitreoscilla hemoglobin for improved production of foreign proteins in bacteria

Type of improvement              Specific effect                                Bacterium
Increased protein production     Growth and α-amylase production                E. coli
                                 Total protein content                          E. coli
                                 Protein secretion and α-amylase production     B. subtilis
Increased chemical production    Growth and biosurfactant production            Gordonia amarae
                                 Acetoin and butanol production                 E. aerogenes
                                 Growth and poly-β-hydroxybutyrate production   E. coli
Increased antibiotic production  Cephalosporin C                                Acremonium chrysogenum
                                 Actinorhodin                                   Streptomyces coelicolor
                                 Erythromycin                                   Saccharopolyspora erythraea
                                 Growth and monensin production                 Streptomyces cinnamonensis
                                 Chlorotetracycline                             Streptomyces aureofaciens
Enhanced bioremediation          Growth and degradation of 2,4-dinitrotoluene   P. aeruginosa, Burkholderia sp.
                                 Degradation of benzoic acid                    X. maltophilia
                                 Growth and degradation of benzoic acid         Burkholderia sp.
                                 Degradation of 2-chlorobenzoate                Burkholderia cepacia
Enhanced physiology              Growth                                         Tremella fuciformis
                                 Copper uptake                                  E. coli
                                 Growth                                         E. aerogenes
                                 Resistance to sodium nitroprusside             E. coli

Adapted from Zhang et al., Biotechnol. Adv. 25:123-136, 2007.


FIGURE 6.24 E. coli cells in their planktonic form in solution and not
interacting with any surface (A), as a monolayer attached to a solid surface
(B), and as part of a three-dimensional biofilm attached to a solid surface
(C).

Workers have previously tried to limit the formation of biofilms either
by adding biofilm-inhibiting chemical agents to the bacterial growth
medium or by utilizing new materials to which bacteria adhere less
efficiently. However, these approaches have met with only limited success.
Another way to limit biofilm formation is to genetically engineer bacterial
strains that are deficient in forming biofilms. This was done by creating an
E. coli strain in which the genes that were involved in the biosynthesis of
curli, colanic acid, and bacterial pili were deleted. Pili are required for
initial attachment of a bacterial cell to a solid surface, curli are needed
for cell-cell and cell-surface attachment, and colanic acid contributes to the
three-dimensional structure of the biofilm. When genes involved in these
three functions were deleted, the resultant strain of E. coli was unable to
form biofilms, displayed an increased sensitivity to antibiotics, transformed
with a much greater efficiency, and was able to produce a higher level of
recombinant protein (Table 6.6). This result indicates that it is possible to
engineer an E. coli host cell to be more efficient at producing foreign
proteins. In addition to producing higher levels of recombinant proteins, the
use of a bacterial host strain that displays substantially increased
sensitivity to common antibiotics may obviate the overuse of antibiotics in
the production of foreign proteins. This should make it easier to ensure that
the final purified protein product is free of any contaminating antibiotics.
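
The size of these improvements is easier to judge as ratios. The short Python sketch below uses only the values reported in Table 6.6; the dictionary keys and the function name are illustrative choices, not part of the original study.

```python
# Values transcribed from Table 6.6 as (wild type, biofilm-minus mutant).
table_6_6 = {
    "adherence_percent": (30, 3),
    "transformants_per_ug_dna": (3.4e5, 7.9e6),
    "endoxylanase_U_per_mL": (6.4, 7.3),
    "lipase_U_per_mL": (7.9, 10.0),
}


def fold_change(wild_type, mutant):
    """Return the mutant value divided by the wild-type value."""
    return mutant / wild_type


# Print one ratio per measurement.
for name, (wild_type, mutant) in table_6_6.items():
    print(f"{name}: {fold_change(wild_type, mutant):.2f}-fold of wild type")
```

On these numbers, the mutant transforms roughly 23 times more efficiently than the wild type, whereas the gains in endoxylanase and lipase activity are modest (about 14% and 27%, respectively).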


TABLE 6.6 Performance and behavior of wild-type compared to biofilm-minus E. coli

Measurement                                                          Wild type    Biofilm-minus mutant
% Adherence to a polystyrene surface                                 30           3
Sensitivity to streptomycin                                          15           1
Sensitivity to rifampin                                              20           1
Transformation efficiency (no. of transformants/µg of plasmid DNA)   3.4 x 10^5   7.9 x 10^6
Intracellular endoxylanase activity (U/mL)                           6.4          7.3
Extracellular lipase activity (U/mL)                                 7.9          10.0

Antibiotic sensitivity is reported as the relative amount of antibiotic needed
to obtain the same amount of bacterial killing. Foreign endoxylanase and
lipase genes were introduced on plasmids.


DNA Integration into the Host Chromosome

A plasmid imposes a metabolic load on the cell because of the energy that
is used for its replication and for the transcription of RNA and translation
of the proteins that it encodes. For the most part, high-copy-number
plasmids impose a greater metabolic load than do low-copy-number plasmids.
As a consequence, a fraction of the cell population often loses its plasmids
during cell growth. Also, cells that lack plasmids generally grow faster than
those that retain them, so plasmidless cells eventually dominate the
culture. After a number of generations of cell growth, the loss of
plasmid-containing cells diminishes the yield of the cloned gene product.
Investigators have devised at least two methods of combating this loss. On
a laboratory scale, plasmid-containing cells are maintained by growing the
cells in the presence of either an antibiotic or an essential metabolite that
enables only plasmid-bearing cells to thrive. But the addition of either
antibiotics or metabolites to pilot plant- or industrial-scale fermentations
can be extremely costly, and it is imperative that anything that is added to
the fermentation, such as an antibiotic or a metabolite, be completely removed
before the product is certified as fit for human use. Moreover, for
genetically engineered microorganisms that are designed to be released into
the environment to remain both effective and environmentally safe, it is
essential that the cloned DNA be retained and be neither easily lost nor
transferred to other microorganisms. For these reasons, the introduction of
cloned DNA directly into the chromosomal DNA of the host organism can
overcome the problem of plasmid loss. When DNA is part of the host
chromosomal DNA, it is relatively stable and consequently can be maintained
for many generations in the absence of selective agents.
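
The takeover of a culture by plasmidless cells can be illustrated with a minimal Python sketch. It is a toy model, not a fitted one: the loss probability per division and the relative growth advantage of plasmid-free cells are hypothetical values chosen only to show the trend, and the function name is an assumption of this example.

```python
def plasmid_bearing_fraction(generations, loss_per_division=0.01,
                             growth_advantage=1.2):
    """Fraction of plasmid-bearing cells after each generation.

    All parameter values are hypothetical and for illustration only.
    loss_per_division: probability that a daughter cell receives no plasmid.
    growth_advantage: growth rate of plasmid-free cells relative to
        plasmid-bearing cells (1.0 means no advantage).
    """
    bearing, free = 1.0, 0.0  # start from a pure plasmid-bearing culture
    fractions = []
    for _ in range(generations):
        # Daughters that lose the plasmid join the plasmid-free pool.
        newly_free = 2.0 * bearing * loss_per_division
        bearing = 2.0 * bearing * (1.0 - loss_per_division)
        # Plasmid-free cells divide faster than plasmid-bearing cells.
        free = free * 2.0 ** growth_advantage + newly_free
        total = bearing + free
        # Normalize to fractions so the numbers stay bounded.
        bearing, free = bearing / total, free / total
        fractions.append(bearing)
    return fractions


if __name__ == "__main__":
    history = plasmid_bearing_fraction(60)
    for generation in range(10, 61, 10):
        print(f"generation {generation}: "
              f"{history[generation - 1]:.3f} plasmid-bearing")
```

Even with a small loss rate, the growth advantage compounds each generation, which is why selection or chromosomal integration is needed to keep the cloned gene in the population.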

The chromosomal integration site of a cloned gene must not be within 
an essential coding gene. Consequently, the input DNA sequence must be 
targeted to a specific nonessential site within the chromosome. In addition, 
to ensure efficient production of the target protein, the input gene should 
be under the control of a regulatable promoter. 

For integration of DNA into a chromosomal site, the input DNA must
share some sequence similarity, usually at least 50 nucleotides, with the
chromosomal DNA, and there must be a physical exchange (recombination)
between the two DNA molecules. Briefly, a generalized protocol for
DNA integration includes the following steps.

1. Identify the desired chromosomal integration site, i.e., a segment of 
DNA on the host chromosome that can be disrupted without 
affecting the normal functions of the cell. 

2. Isolate and clone part or all of the chromosomal integration site. 

3. Ligate a cloned gene and a regulatable promoter either into (Fig.
6.25A) or adjacent to (Fig. 6.25B) the cloned chromosomal integration
site.

4. Transfer the chromosomal integration fragment-cloned-gene construct
into the host cell as part of a plasmid that cannot replicate in
the host cell.

5. Select and perpetuate host cells that express the cloned gene.
Propagation of the cloned gene can occur only if it has been integrated
into the chromosomal DNA of the host cell.

When a host cell is transformed with a nonreplicating plasmid that carries
the cloned gene in the middle of a portion of the cloned chromosomal
integration site, the DNA on the plasmid can base pair with identical
sequences on the host chromosome (Fig. 6.25A). The integration occurs as
a result of a host enzyme-catalyzed double crossover. Alternatively, a single
crossover that incorporates the entire input plasmid into the host
chromosome may occur (not shown). Similarly, insertion of the entire input
plasmid DNA occurs when the cloned gene is inserted next to the cloned
chromosomal integration site (Fig. 6.25B).
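
The difference between the two outcomes can be sketched in Python by treating each DNA molecule as a list of named segments. This is only a bookkeeping model under stated assumptions: the segment labels (a, b, c, x, y, ampR, cmR) and the function names are hypothetical, and real recombination acts on base-paired sequence rather than on labels.

```python
def double_crossover(chromosome, plasmid, left, right):
    """Swap in the plasmid DNA lying between two shared flanking regions.

    Only the segment(s) between `left` and `right` on the plasmid are
    integrated; the rest of the plasmid is not.
    """
    ci, cj = chromosome.index(left), chromosome.index(right)
    pi, pj = plasmid.index(left), plasmid.index(right)
    if not (ci < cj and pi < pj):
        raise ValueError("flanking regions must occur in the same order")
    return chromosome[:ci + 1] + plasmid[pi + 1:pj] + chromosome[cj:]


def single_crossover(chromosome, plasmid, site):
    """Integrate the whole circular plasmid at one shared region.

    The integrated plasmid ends up flanked by two copies of `site`.
    """
    ci = chromosome.index(site)
    pi = plasmid.index(site)
    # Open the circular plasmid at the shared site.
    rest_of_plasmid = plasmid[pi + 1:] + plasmid[:pi]
    return chromosome[:ci + 1] + rest_of_plasmid + [site] + chromosome[ci + 1:]


# Cloned gene inside the cloned chromosomal region (as in Fig. 6.25A).
print(double_crossover(["x", "a", "b", "y"],
                       ["a", "gene", "b", "ampR"], "a", "b"))

# Cloned gene next to the cloned chromosomal region (as in Fig. 6.25B).
print(single_crossover(["x", "c", "y"],
                       ["c", "gene", "ampR", "cmR"], "c"))
```

In this model the double crossover leaves the plasmid-borne marker behind, whereas the single crossover carries every plasmid segment, markers included, into the chromosome.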

The effectiveness of integration of a cloned gene was examined in B.
subtilis. The investigators constructed an E. coli plasmid that contained an
α-amylase gene from Bacillus amyloliquefaciens that had been inserted into
the middle of a chromosomal DNA fragment from B. subtilis but could not
replicate in B. subtilis. This construct, however, could transform B. subtilis.
Transformants expressing α-amylase, an enzyme involved in the hydrolysis
of starch, were recovered, indicating that the α-amylase gene had been
integrated into the B. subtilis chromosome and was functioning. The
selected transformants were resistant to ampicillin and chloramphenicol.
Because both of the antibiotic resistance genes were on the input plasmid,
this result indicated that a single recombination event must have occurred
and caused the integration of the entire plasmid into the B. subtilis
chromosomal DNA.

To increase the number of copies of the α-amylase gene that were
present on the B. subtilis chromosome, the original transformants were
grown in the presence of high levels of chloramphenicol. Only cells in
which spontaneous DNA duplication of the integrated plasmid had
occurred could survive under these conditions. The cells selected for a high
level of chloramphenicol resistance were then assayed for α-amylase
activity (Table 6.7). With this selection procedure, cells with up to nine
copies of the α-amylase gene were identified. The level of enzyme activity
expressed from the chromosomally integrated genes far exceeded the
activity levels that occurred when the α-amylase gene was present on a
multicopy (about 20 to 40 copies per cell) B. subtilis plasmid, probably
reflecting either the instability of the multicopy plasmid or the energy
drain on the transformed cells imposed by the synthesis of the
chloramphenicol resistance gene product and α-amylase.
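
As a quick check on this comparison, the Python sketch below computes the activity per integrated gene copy and the ratio to the multicopy-plasmid strain, using only the values reported in Table 6.7. The variable names are illustrative.

```python
# Values transcribed from Table 6.7: integrated copies per genome mapped to
# α-amylase activity (U/mL of mid-log-phase cells).
activity_by_copies = {2: 500, 5: 2300, 7: 3100, 8: 3400, 9: 4400}
multicopy_plasmid_activity = 700  # gene carried on a multicopy plasmid

# Activity contributed, on average, by each integrated copy.
per_copy = {n: activity / n for n, activity in activity_by_copies.items()}

# Activity of each integrant relative to the multicopy-plasmid strain.
relative_to_plasmid = {
    n: activity / multicopy_plasmid_activity
    for n, activity in activity_by_copies.items()
}

for n in sorted(activity_by_copies):
    print(f"{n} copies: {per_copy[n]:.0f} U/mL per copy, "
          f"{relative_to_plasmid[n]:.2f}x the multicopy plasmid")
```

The nine-copy integrant yields a little more than six times the activity of the multicopy-plasmid strain, even though the plasmid is present at about 20 to 40 copies per cell.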

In one study, several copies of a foreign gene were inserted into different
predetermined sites on the B. subtilis chromosome. To do this, a two-step
procedure was used for each copy of the foreign gene to be inserted (Fig.
6.26). First, a selectable marker gene, such as an antibiotic resistance
gene, was inserted into the middle of a defined but nonessential piece of B.
subtilis chromosomal DNA on a plasmid vector that could not replicate in B.
subtilis (Fig. 6.26, step 1). Following transformation of B. subtilis with
this construct, cells expressing the marker gene were selected. These
transformants carried the selectable marker gene integrated at the specified
site in the B. subtilis chromosomal DNA. Second, the target gene with its
transcriptional and translational control sequences was inserted into the
middle of the piece of B. subtilis chromosomal DNA that was used with the
marker gene and introduced into the cell on a nonreplicatable plasmid (Fig.
6.26, step 2). Following transformation of B. subtilis with the target
gene-plasmid construct, cells that no longer expressed the marker gene were
selected after replica plating. These transformants carried the target gene
and not the marker gene integrated into the B. subtilis chromosomal DNA. To
integrate additional copies of the target gene into the host cell chromosomal
DNA, the entire procedure was repeated several times with different defined
but nonessential regions of B. subtilis chromosomal DNA.


FIGURE 6.25 Integration of a cloned gene into a chromosomal site. (A) The cloned
gene has been inserted, on a plasmid, in the middle of a cloned segment of DNA
(ab) from the host chromosome. Homologous DNA pairing occurs between
plasmid-borne DNA regions a and b and host chromosome DNA regions a' and b',
respectively. A double-crossover event (x-x) results in the integration of the
cloned gene. (B) The cloned gene is inserted adjacent to the cloned DNA from
the host chromosome (c). Homologous DNA pairing occurs between plasmid DNA
region c and host chromosome DNA region c'. A single recombination event (x)
within the paired c-c' DNA region results in the integration of the entire
plasmid, including the cloned gene.


TABLE 6.7 α-Amylase gene copy number and activity in B. subtilis

No. of copies/genome    Activity (U/mL of mid-log-phase cells)
2                       500
5                       2,300
7                       3,100
8                       3,400
9                       4,400
Multicopy plasmid       700

Adapted from Kallio et al., Appl. Microbiol. Biotechnol. 27:64-71, 1987.


FIGURE 6.26 Insertion of a foreign gene into a unique predetermined site on the B.
subtilis chromosome. In step 1, a marker gene is integrated into the host cell
chromosomal DNA by homologous recombination. In step 2, the selectable marker
gene is replaced by the target gene. The process may then be repeated with
different nonessential regions of B. subtilis chromosomal DNA.

Removing Selectable Marker Genes 

The integration of a selectable marker gene along with a gene of interest is
helpful for identifying transformed cells under laboratory conditions.
However, for organisms that are intended for deliberate release into the
environment (e.g., to degrade pollutants), the presence of a selectable
marker gene for antibiotic resistance is not desirable, because it
contributes to the proliferation of antibiotic resistance genes in the
environment. To ensure that only the cloned gene is integrated into the host
chromosome, the cloned gene may be inserted into a DNA fragment (on a
plasmid) that is identical to DNA on the host chromosome. Moreover, the input
DNA sequence must be targeted to a specific nonessential site within the
chromosome or important host functions will be disrupted. However, when no
nonessential host genes have been identified, a cloned gene may be
introduced into the host chromosome by means of a single crossover that
incorporates the entire input plasmid into the host chromosome. This occurs
when the cloned gene is inserted (on a plasmid) next to the cloned
chromosomal integration site. In this case, any selectable markers on the
plasmid (including antibiotic resistance genes) will be inserted into the host
chromosomal DNA. To avoid the problems associated with these approaches,
several groups of workers have developed "insertion-removal" systems
for the selective removal of marker genes from host cell chromosomes. An
overview of one of these systems is described below.

When a marker gene is flanked by certain short specific DNA sequences
and then inserted into either a plasmid or chromosomal DNA, the gene
may be excised by treatment of the construct with an enzyme that
recognizes the flanking DNA sequences and removes the DNA between them
(Fig. 6.27). One combination of an enzyme and DNA sequence that is useful
for this sort of manipulation is the Cre-loxP recombination system, which
consists of the Cre recombinase enzyme and two 34-bp loxP recombination
sites. The marker gene to be removed is flanked by loxP sites, and after
integration of the plasmid into the chromosomal DNA, the marker gene is
removed by the Cre enzyme. A gene encoding the Cre enzyme is located on its
own plasmid, which can be introduced into the chromosomally transformed
host cells. Marker gene excision is triggered by the addition of IPTG to the
growth medium; IPTG inactivates the lac repressor (encoded by the lacI gene),
which derepresses the E. coli lac promoter-operator that is present
upstream of the Cre gene and causes the Cre enzyme to be synthesized.
Once there is no longer any need for the Cre enzyme, the plasmid that
contains the gene for this enzyme under the control of the lac promoter may
be removed from the host cells merely by raising the temperature. This
plasmid has a temperature-sensitive replicon that allows it to be maintained
in the cell at 30°C but not above 37°C.
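
The net result of Cre-mediated excision can be sketched as a string operation in Python: the DNA between two directly repeated loxP sites is deleted and a single loxP site remains. The loxP sequence below is the commonly cited 34-bp site (two 13-bp inverted repeats around an 8-bp spacer); the gene and flanking sequences are hypothetical placeholders, and the function models only the outcome, not the recombination mechanism.

```python
# Commonly cited 34-bp loxP site: 13-bp repeat, 8-bp spacer, 13-bp repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(dna, site=LOXP):
    """Delete the DNA between the first two directly repeated sites.

    One copy of the site is left behind. If fewer than two sites are
    present, the DNA is returned unchanged.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna
    return dna[:first + len(site)] + dna[second + len(site):]


# Hypothetical placeholder sequences, for illustration only.
target_gene = "ATGGCTAGCTAA"
marker_gene = "ATGAAACGCTGA"
integrated = "CCCC" + target_gene + LOXP + marker_gene + LOXP + "GGGG"

after_cre = cre_excise(integrated)
print(len(LOXP), after_cre.count(LOXP), marker_gene in after_cre)
```

The target gene is untouched because it lies outside the loxP-flanked region; only the marker and one of the two loxP sites are lost.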

Although protocols for the excision of marker genes have not as yet
been widely implemented, most of the details have been worked out.
Given the unease that exists among certain segments of the public in
various locales regarding the deliberate release of genetically engineered
bacteria into the environment, it is essential that the organisms that are
released be as benign as possible; removing antibiotic resistance genes is
an important step in that direction.


FIGURE 6.27 Removal of a selectable marker gene following integration of plasmid
DNA into a bacterial chromosome. A single crossover event (x) occurs between
chromosomal DNA and a homologous DNA fragment (hatched) on a plasmid,
resulting in the integration of the entire plasmid into the chromosomal DNA. The
selectable marker gene, which is flanked by loxP sites, is excised by the action of the
Cre enzyme, leaving one loxP site on the integrated plasmid. The Cre enzyme is on
a separate plasmid within the same cell under the transcriptional control of the E.
coli lac promoter so that excision is induced when IPTG is added to the growth
medium.


Increasing Secretion 

The process of secretion of proteins in E. coli entails exit through the inner
(cytoplasmic) cell membrane to the periplasm for many proteins and
passage through the outer membrane for a few proteins. Directing a foreign
protein to the periplasm or the growth medium makes its purification
easier and less costly, as many fewer proteins are present there than in the
cytoplasm. Moreover, the stability of a cloned protein depends on its
cellular location in E. coli. For example, recombinant proinsulin is
approximately 10 times more stable if it is secreted (exported) into the
periplasm than if it is localized in the cytoplasm. In addition, secretion of
proteins to the periplasm facilitates the correct formation of disulfide bonds
because the periplasm provides an oxidative environment, in contrast to the
more reducing environment of the cytoplasm. Table 6.8 indicates the amounts
of secreted recombinant pharmaceutical protein attainable in various
bacterial systems.


TABLE 6.8 Yields of several secreted recombinant proteins produced in different bacteria

Protein                             Yield (per liter)   Host bacterium
Hirudin                             >3 g                E. coli
Human antibody fragment             1-2 g               E. coli
Human insulin-like growth factor    8.5 g               E. coli
Monoclonal antibody 5T4             700 mg              E. coli
Humanized anti-CD18 F(ab')2         2.5 g               E. coli
Human epidermal growth factor       325 mg              E. coli
Alkaline phosphatase                5.2 g               E. coli
Staphylokinase                      340 mg              B. subtilis
Human proinsulin                    1 g                 B. subtilis
Human calcitonin precursor          2 g                 Staphylococcus carnosus
Organophosphohydrolase              1-2 g               Ralstonia eutropha
Human CD4 receptor                  200 mg              Streptomyces lividans
Human insulin                       100 mg              Streptomyces lividans


Secretion into the Periplasm 

Normally, an amino acid sequence called the signal peptide (also called the
signal sequence, or leader peptide), located at the N-terminal end of a
protein, facilitates its export by enabling the protein to pass through the
cell membrane (Fig. 6.28). It is sometimes possible to engineer a protein for
secretion to the periplasm by adding the DNA sequence encoding a signal
peptide to the cloned gene. When the recombinant protein is secreted into
the periplasm, the signal peptide is precisely removed by the cell's
secretion apparatus so that the N-terminal end of the target protein is
identical to that of the natural protein.

However, the presence of a signal peptide sequence does not necessarily
guarantee a high rate of secretion. When the fusion of a target gene
to a DNA fragment encoding a signal peptide is ineffective in producing a
secreted protein product, alternative strategies need to be employed. One
approach that was found to be successful for the secretion of the protein
interleukin-2 was the fusion of the interleukin-2 gene downstream from the
gene for the entire propeptide maltose-binding protein, rather than just the
maltose-binding protein signal sequence, with DNA encoding the factor Xa
recognition site as a linker peptide separating the two genes (Fig. 6.29).
When this genetic fusion, on a plasmid vector, was used to transform E. coli
cells, as expected, a large fraction of the fusion protein was found to be
localized in the host cell periplasm. Functional interleukin-2 could then be
released from the fusion protein by digestion with factor Xa.
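
The logic of releasing a target protein from a carrier-linker-target fusion can be sketched in Python with one-letter amino acid codes. The sketch assumes the widely used factor Xa recognition sequence Ile-Glu-Gly-Arg (IEGR), with cleavage after the arginine; the text does not state which linker sequence was used, and the carrier and target sequences below are short hypothetical placeholders rather than the real maltose-binding protein or interleukin-2 sequences.

```python
# Assumed factor Xa recognition sequence (Ile-Glu-Gly-Arg); cleavage occurs
# immediately after the arginine residue.
FACTOR_XA_SITE = "IEGR"


def release_target(fusion, site=FACTOR_XA_SITE):
    """Split a fusion protein after the first protease recognition site.

    Returns (carrier plus linker, released target protein).
    """
    index = fusion.find(site)
    if index == -1:
        raise ValueError("no protease recognition site in the fusion protein")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


# Hypothetical placeholder sequences, for illustration only.
carrier = "MKKTAAAGGS"
target = "APTSSSTKKT"
fusion_protein = carrier + FACTOR_XA_SITE + target

carrier_with_linker, released = release_target(fusion_protein)
print(carrier_with_linker, released)
```

Because the cut falls at the end of the linker, the released target begins with its own first residue, which is why a protease site placed directly upstream of the target is convenient for recovering an authentic N terminus.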

In many instances, when foreign proteins engineered for secretion are
overproduced in E. coli, the precursor form is only partially processed, with
about half of the secreted proteins retaining the leader peptide and the other
half being fully processed to the mature form. This is probably the result of
overloading some of the components involved in the secretion process. If
this is the case, then it might be possible to increase the ratio of processed to
unprocessed proteins by increasing the level of expression of some of the
limiting components of the protein secretion pathway. This hypothesis was
tested in a series of experiments in which a plasmid containing both the
prlA4 and secE genes, which encode major components of the molecular
apparatus that physically moves proteins across the membrane, was
introduced into E. coli host cells. Following this augmentation of the host
cell secretory machinery, the fraction of the cloned protein (in this case, the
cytokine interleukin-6) that was secreted to the periplasm as the processed
form with the signal peptide removed increased from about 50% to more
than 90%.


FIGURE 6.28 Schematic representation of protein secretion. The ribosome is
attached to a cellular membrane, and the signal peptide (signal sequence;
leader peptide) at the N terminus is transported, by the secretion apparatus,
across the cytoplasmic membrane, followed by the rest of the amino acids
that constitute the mature protein or peptide. Once the signal peptide has
crossed the membrane, it is cleaved by an enzyme, called a signal peptidase,
associated with the membrane. Membrane proteins, as well as secreted
proteins, generally contain a signal peptide (prior to its removal by
processing).

In many cases, secretion of heterologous proteins in E. coli is dependent 
on the translational level of the protein. The foreign proteins that were 
translated most efficiently were not necessarily secreted to the greatest 
extent. Sometimes too high a level of translation of a foreign protein can 
overload the cell's secretion machinery and inhibit the secretion of that 
protein. Thus, one way to ensure that secretion of a target protein occurs 
most efficiently may be to lower the level of expression of that protein. 

Secretion into the Medium 

E. coli and other gram-negative microorganisms generally cannot secrete
proteins into the surrounding medium because of the presence of an outer
membrane that restricts this process. There are at least two solutions to this
problem. The first is to use as host organisms gram-positive prokaryotes or
eukaryotic cells, both of which lack an outer membrane and therefore can
secrete proteins directly into the medium. The second solution entails the
use of genetic manipulation to engineer gram-negative bacteria that can
secrete proteins directly into the growth medium.


FIGURE 6.29 Engineering the secretion of interleukin-2. (A) Interleukin-2 fused to
the E. coli maltose-binding protein signal peptide (MBP signal) is not secreted. (B)
When interleukin-2 is fused to the E. coli maltose-binding protein and its signal
peptide, with the two proteins joined by a linker peptide, secretion occurs.
Subsequently, the maltose-binding protein and the linker peptide are removed by
digestion with factor Xa.

In general, relatively few proteins pass through the outer membrane of
E. coli. However, some gram-negative bacteria can secrete a bactericidal
protein called a bacteriocin into the medium. A cascade mechanism is
responsible for this specific secretion. A bacteriocin release protein activates
phospholipase A, which is present in the bacterial inner membrane and
cleaves membrane phospholipids so that both the inner and outer
membranes are permeabilized. As a result, some cytoplasmic and periplasmic
proteins are released into the culture medium. Thus, by putting the
bacteriocin release protein gene onto a plasmid under the control of a strong
regulatable promoter, E. coli cells may be permeabilized at will. E. coli cells
that carry the bacteriocin release protein gene are transformed with another
plasmid carrying a cloned gene that has been fused to a secretion signal
peptide sequence. The cloned gene is placed under the same
transcriptional-regulatory control as the bacteriocin release protein gene so
that the two genes can be induced simultaneously, with the target protein
being secreted into the medium (Fig. 6.30).

FIGURE 6.30 E. coli cells engineered to secrete a foreign protein to the periplasm by
fusing the gene of interest (green) to a secretion signal (A) and to the growth
medium by permeabilizing cell membranes with a bacteriocin release protein
encoded on another plasmid (red) (B).


Although secretion of E. coli proteins to the growth medium is quite
rare, the small protein YebF is naturally secreted to the medium without
lysing the cells or permeabilizing the membranes. When various proteins
are fused to the C-terminal end of YebF, following the removal of the signal
peptide, the entire fusion construct is secreted to the medium. It is believed
that the pre-YebF-foreign-protein fusion is first secreted across the E. coli
cytoplasmic membrane to the periplasm with the concomitant removal of
the signal peptide (Fig. 6.31). Next, the leaderless YebF-foreign-protein
fusion is secreted from the periplasm to the medium by an unknown
process that involves specific amino acids that are part of YebF that interact
with an unidentified receptor on the E. coli outer membrane. To date,
researchers have reported facilitating the secretion to the medium of
human interleukin-2 (a 15-kDa hydrophobic protein), bacterial α-amylase
(a 48-kDa hydrophilic protein), and alkaline phosphatase (94 kDa),
demonstrating that a wide range of proteins may be secreted to the medium
using this system. The next step in the development of this system will likely
involve engineering a readily cleavable linker region between YebF and the
protein of interest so that the protein of interest can be recovered in its
native form. In addition, this approach should make it easier to avoid
contaminating purified recombinant proteins with E. coli lipopolysaccharide,
which is a pyrogenic toxin that may be released upon lysis of the bacterial
outer membrane and is therefore a serious concern when the proteins are
produced for use as therapeutic agents.

YebF is not the only E. coli protein that is secreted to the growth
medium. The flagellar type III secretion apparatus secretes the protein
flagellin to the growth medium. Thus, it is possible to fuse a 173-bp
untranslated DNA fragment upstream of the gene encoding flagellin, as
well as a transcriptional terminator from the same gene, to a gene encoding
a protein of interest, with the result that the protein of interest is efficiently
secreted into the medium. However, this system is not applicable to all
proteins.


FIGURE 6.31 Secretion, following expression in E. coli, of YebF-interleukin-2 fusion
protein into the growth medium. The protein synthesized in the cytoplasm includes
a signal peptide (yellow) that is excised when the fusion protein is secreted to the
periplasm. The YebF-interleukin-2 fusion protein is then secreted from the
periplasm, across the outer membrane, to the growth medium. YebF is shown in
blue and interleukin-2 in green.


Metabolic Load

The introduction and expression of foreign DNA in a host organism often
change the metabolism of the organism in ways that may impair normal
cellular functioning (Fig. 6.32). This phenomenon, which is a multifaceted
biological response, is due to a metabolic load (metabolic burden;
metabolic drain) that is imposed upon the host by the foreign DNA. A
metabolic load can occur as the result of a variety of conditions, including
the following.

• An increasing plasmid copy number and /or size requires increasing 
amounts of cellular energy for plasmid replication and maintenance 
(Table 6.9). 

• The limited amount of dissolved oxygen in the growth medium is 
often insufficient for both host cell metabolism and plasmid mainte¬ 
nance and expression. 

• Overproduction of both target and marker proteins may deplete the 
pools of certain aminoacyl-tRNAs (or even certain amino acids) 
and/or drain the host cell of its energy (in the form of ATP or 
GTP). 

• When a foreign protein is overexpressed and then exported from the 
cytoplasm to the cell membrane, the periplasm, or the external 
medium, it may "jam" export sites and thereby prevent the proper 
localization of other, essential host cell proteins. 

• Host cells with unusual metabolic features, such as a naturally high 
rate of respiration, e.g., Azotobacter spp., are more likely to be 
affected by these perturbations than are other host cells. 

• The foreign protein per se may interfere with the functioning of the 
host cell, for example, by converting an important and needed 
metabolic intermediate into a compound that is irrelevant, or even 
toxic, to the cell. 


FIGURE 6.32 Schematic representation of the biological consequences for a host cell 
of overexpressing a foreign protein and generating a metabolic load. When there is 
no metabolic load, the cell has access to sufficient energy and resources. The 
overexpression of a foreign protein (shown here as being analogous to the cell's 
wearing a heavy backpack) prevents the cell from obtaining sufficient energy and 
resources for its growth and metabolism so that it is less able to grow rapidly and 
attain a high density. 
One of the most commonly observed consequences of a metabolic load 
is a decrease in the rate of cell growth after the introduction of foreign DNA 
(Table 6.9). Sometimes, a metabolic load may result in plasmid-containing 
cells losing a portion of the plasmid DNA. Even in the presence of selective 
pressure, all or part of the recombinant gene may be deleted from the 
plasmid. Since cells growing in the presence of a metabolic load generally 
have a decreased level of energy for cellular functions, energy-intensive 
metabolic processes, such as nitrogen fixation and protein synthesis, are 
invariably adversely affected by a metabolic load. A metabolic load may 
also lead to changes in the host cell size and shape and to increases in the 
amount of extracellular polysaccharide produced by the bacterial host cell. 
This extracellular carbohydrate causes the cells to stick together, making 
harvesting, e.g., by cross-flow microfiltration procedures, and protein 
purification more difficult. 
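
The growth penalty reported in Table 6.9 can be restated as a change in doubling time. The following is a minimal sketch in Python; the data are the values from Table 6.9, while the function names are illustrative and the conversion assumes simple exponential growth, in which doubling time is inversely proportional to the specific growth rate.

```python
# Values from Table 6.9 (Seo and Bailey, 1985):
# plasmid label -> (plasmid copy number, relative specific growth rate).
TABLE_6_9 = {
    "None": (0, 1.00),
    "A": (12, 0.92),
    "B": (24, 0.91),
    "C": (60, 0.87),
    "D": (122, 0.82),
    "E": (408, 0.77),
}


def growth_penalty_percent(relative_growth_rate: float) -> float:
    """Percent reduction in specific growth rate versus the plasmid-free host."""
    return (1.0 - relative_growth_rate) * 100.0


def relative_doubling_time(relative_growth_rate: float) -> float:
    """Doubling time relative to the plasmid-free host.

    Assumes exponential growth, where doubling time = ln(2) / growth rate,
    so the ratio of doubling times is the inverse ratio of growth rates.
    """
    if relative_growth_rate <= 0:
        raise ValueError("relative growth rate must be positive")
    return 1.0 / relative_growth_rate


if __name__ == "__main__":
    for plasmid, (copies, mu_rel) in TABLE_6_9.items():
        print(
            f"{plasmid:>4}  copies={copies:>3}  "
            f"penalty={growth_penalty_percent(mu_rel):4.1f}%  "
            f"doubling time x{relative_doubling_time(mu_rel):.2f}"
        )
```

Under this assumption, the host carrying plasmid E (408 copies) grows about 23% more slowly and needs roughly 1.3 times as long to double as the plasmid-free host.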

Translational errors occur in growing E. coli cells at a rate of about 
2 × 10^-4 to 2 × 10^-3 errors per cell per generation. However, when a particular 
aminoacyl-tRNA becomes limiting, as is often the case when a foreign protein 
is overexpressed in E. coli, there is an increased probability that an incorrect 
amino acid will be inserted in place of the limiting amino acid. In 
addition, translational accuracy, which depends upon the availability of 
GTP as part of a proofreading mechanism, is likely to be further decreased 
as a consequence of a metabolic load from foreign-protein overexpression. 
For example, a high level of expression of mouse epidermal growth factor in 
E. coli caused about 10 times the normal amount of incorrect amino acids to 
be incorporated into the recombinant protein. This frequency of errors 
diminishes the usefulness of the protein as a therapeutic agent, since (1) the 
specific activity and stability of the target protein are significantly lowered 
and (2) the incorrect amino acids may cause the protein to be immunogenic 
in humans. 

A well-designed experiment can minimize the impact of the metabolic 
load, optimize the yield of the recombinant protein, and enhance the 
stability of the transformed host cell. For example, the extent of the metabolic 
load can be reduced by using a low-copy-number rather than a high-copy- 
number plasmid vector. An even better strategy might be to avoid the use 
of plasmid vectors altogether and integrate the introduced foreign DNA 
directly into the chromosomal DNA of the host organism. In this case, 
plasmid instability will not be a problem. With an integrated cloned gene, 
without the plasmid vector, the transformed host cell will not waste its 
resources synthesizing unwanted and unneeded antibiotic resistance 

TABLE 6.9 Effect of plasmid copy number on host cell growth rate 

E. coli HB101 with plasmid   Plasmid copy no.   Relative specific growth rate 
None                         0                  1.00 
A                            12                 0.92 
B                            24                 0.91 
C                            60                 0.87 
D                            122                0.82 
E                            408                0.77 

Adapted from Seo and Bailey, Biotechnol. Bioeng. 27:1668-1674, 1985. 

The different plasmids, designated A, B, C, D, and E, encode only β-lactamase and are all the 
same size. The growth rates were normalized to the growth rate value for E. coli HB101 without a 
plasmid. 
marker gene products. Chromosome integration is particularly important 
when a genetically engineered organism is destined to be released directly 
into the environment. The use of strong but regulatable promoters is also 
an effective means of reducing the metabolic load. In this case, the fermentation 
process is performed in two stages. During the first, or growth, stage, 
the promoter controlling the transcription of the target gene is turned off, 
while during the second, or induction, stage, this promoter is turned on 
(see chapter 17). 

When the codon usage of the foreign gene is different from the codon 
usage of the host organism, depletion of specific aminoacyl-tRNA pools 
may be avoided by either completely or partially synthesizing the target 
gene to better reflect the codon usage of the host organism. However, since 
this is not a simple procedure, such an approach is likely to be used in only 
a limited number of instances. Nevertheless, in one study, it was found that 
levels of the protein streptavidin were 10-fold higher in E. coli when expression 
was directed by a synthetic gene with a G+C content of 54% than when 
it was directed by the natural gene with a G+C content of 69%. 
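
As a rough illustration of the quantities involved, the sketch below computes the G+C content and the codon counts of a coding sequence in Python. The two 12-nucleotide fragments are hypothetical examples written for this sketch (both encode Met-Ala-Gly-Arg with different synonymous codons); they are not the streptavidin sequences from the study, and the function names are illustrative.

```python
from collections import Counter


def gc_content(dna: str) -> float:
    """Fraction of G and C nucleotides in a DNA sequence (0.0 to 1.0)."""
    seq = dna.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_counts(coding_sequence: str) -> Counter:
    """Count each codon in an in-frame coding sequence."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


if __name__ == "__main__":
    # Hypothetical fragments: the same peptide (Met-Ala-Gly-Arg) written
    # with G+C-rich codons and with synonymous, less G+C-rich codons.
    gc_rich = "ATGGCCGGCCGC"
    recoded = "ATGGCTGGTCGT"
    for name, seq in (("G+C-rich", gc_rich), ("recoded", recoded)):
        print(name, f"G+C = {gc_content(seq):.1%}", dict(codon_counts(seq)))
```

Comparing the codon counts of a gene with a codon usage table for the intended host is one way to spot codons that might deplete rare aminoacyl-tRNA pools; which codons count as rare depends on the host and is not encoded in this sketch.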

Although it may at first seem counterintuitive, one way to increase the 
amount of foreign protein produced during the fermentation process is to 
accept a modest level of foreign-gene expression—perhaps 5% of the total 
cell protein—and instead focus on attaining a high host cell density. An 
organism with a 5% foreign-protein expression level and a low level of 
metabolic load that can be grown to a density of 40 grams (dry weight) per 
liter produces more of the target protein than one with a 15% expression 
level for the same protein and a cell density of only 5 to 10 grams (dry 
weight) per liter. 
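
The comparison above can be checked with simple arithmetic. The Python sketch below multiplies the expression level (fraction of total cell protein) by the cell density; the protein fraction of dry weight (0.5 here) is an assumed placeholder value that is not given in the text, and it cancels out when the two cases are compared. The function name is illustrative.

```python
def target_protein_per_liter(
    expression_fraction: float,
    cell_density_g_per_l: float,
    protein_fraction_of_dry_weight: float = 0.5,  # assumed value, not from the text
) -> float:
    """Grams of target protein per liter of culture.

    expression_fraction: target protein as a fraction of total cell protein.
    cell_density_g_per_l: grams (dry weight) of cells per liter.
    """
    return expression_fraction * protein_fraction_of_dry_weight * cell_density_g_per_l


if __name__ == "__main__":
    low_load = target_protein_per_liter(0.05, 40.0)
    high_load_best = target_protein_per_liter(0.15, 10.0)
    high_load_worst = target_protein_per_liter(0.15, 5.0)
    print(f"5% expression at 40 g/liter:   {low_load:.3f} g/liter")
    print(f"15% expression at 10 g/liter:  {high_load_best:.3f} g/liter")
    print(f"15% expression at 5 g/liter:   {high_load_worst:.3f} g/liter")
```

Whatever value is assumed for the protein fraction, the 5% culture at 40 grams per liter yields more target protein per liter (proportional to 0.05 × 40 = 2.0) than the 15% culture at 5 to 10 grams per liter (proportional to 0.75 to 1.5).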


SUMMARY 


The production of a protein requires that the gene be properly 
transcribed and then that the mRNA be translated. In 
prokaryotes, a promoter region is necessary for the initiation 
of transcription at the correct nucleotide site, and a sequence 
at the end of the gene (a terminator) is essential for the cessation 
of transcription. Cloned genes often lack these signals. 
Consequently, for expression of a cloned gene in a prokaryotic 
host cell, the appropriate signals recognized by the host must 
be provided in the correct locations. Moreover, the aim of 
many biotechnology applications is to produce large amounts 
of protein, so it is necessary to use a promoter that supports 
transcription at a high level (strong promoter) and that is compatible 
with the RNA polymerase of the host cell. However, 
continuous transcription of a cloned gene drains the energy 
reserves of the host cell, so it is also necessary to use a promoter 
system whose activity can be regulated either by the 
addition of a low-molecular-weight compound or by changing 
the growth temperature. In addition to utilizing promoters in 
their native forms, it is possible to alter the DNA to produce 
promoters with a wide range of activities. 

Efficient protein synthesis from a gene depends on specific 
sequences in its mRNA, and often for cloned genes, other 
manipulations are required to ensure the stability of the protein 
and, if necessary, its secretion. As part of the gene-engineering 
process, a ribosome-binding site is placed in the DNA 
segment that precedes the translation initiation site (start 
codon), which also may need to be added. Finally, a termination 
sequence may be placed at the end of the cloned gene to 
ensure that translation stops at the correct amino acid. If secretion 
of the protein is desired, the DNA sequence preceding the 
cloned gene should include a signal sequence in the same 
reading frame as the target gene. 

Lack of stability of the protein that is encoded by a cloned 
gene is another complication that is often encountered. For 
example, the recombinant protein can be degraded by proteolytic 
enzymes of the host cell. One strategy to overcome this 
problem is to alter the cloned gene so that it encodes one or 
more additional amino acids at its N terminus. In this form, 
the recombinant protein is no longer rapidly degraded. In 
addition, the amino acids that are added to the recombinant 
protein can sometimes be used for purifying the fusion protein 
by, for example, immunoaffinity column chromatography. 
In these cases, the junction point of a fusion protein is usually 
designed to be cleaved in vitro either chemically or enzymatically. 
Fusion proteins may also be used to facilitate the purification 
of the target protein. 

Most microorganisms that are used to express foreign proteins 
require oxygen for growth. However, oxygen has only 
limited solubility in water and is rapidly depleted from the 
growth media of actively growing cultures, especially when 
the cultures attain a high cell density. Attempts to deal with 
the limited amount of dissolved oxygen available for cell 
growth and maintenance have included (1) utilizing microbial 
host strains deficient in the production of some proteolytic 
enzymes that are produced during stationary phase and (2) 
introducing the gene for the Vitreoscilla sp. hemoglobin that 
binds oxygen from the environment, creating a higher level of 
intracellular oxygen and thereby causing increases in both 
host and foreign-protein synthesis. 

By increasing the numbers of copies of the cloned gene, the 
production of the protein is increased, and under these conditions, 
the stability of the product is often enhanced. However, 
during scale-up of plasmid-based systems, all or parts of the 
plasmid can be lost. This plasmid instability is undesirable for 
commercial systems. To overcome this problem, researchers 
have developed protocols for integrating a cloned gene into a 
chromosomal site of the host organism. Under these conditions, 
the gene is maintained stably as part of the DNA of the 
host organism. 

The introduction and expression of foreign DNA in a host 
organism often change the metabolism of the organism and 
thereby impair its normal functioning. This phenomenon is 
called a metabolic load. A variety of strategies have been 
developed to minimize the extent of the perturbations caused 
by a metabolic load and at the same time optimize the yield 
of the target protein and the stability of the transformed 
cell. 


Proteins produced by gram-negative bacteria may be engineered 
so that they are secreted to either the periplasm or the 
external medium by including a sequence encoding a peptide 
that marks the protein for secretion. This generally facilitates 
the purification of the protein and ensures that its expression 
does not interfere with cellular functioning. 

Many bacterial cells used to produce recombinant proteins 
tend to bind to solid surfaces and form biofilms. In this state, 
bacteria are resistant to high levels of antibiotics and limited 
in the amount of foreign protein that they can produce. To 
remedy this, it is possible to genetically engineer bacteria that 
are unable to form biofilms. 

An ideal (plasmid) expression system might include a convenient 
multiple cloning site in three different reading frames 
in which to insert the target gene, a strong but regulatable 
promoter, a selectable marker (to eliminate the large background 
of nontransformed cells), a replication origin for 
plasmid maintenance or a site(s) for homologous recombination 
for integration of the plasmid into the host chromosomal 
DNA, a secretory signal peptide, and a removable (cleavable) 
fusion tag that facilitates protein stability and eventual purification. 
However, both target proteins and expression systems 
are quite diverse, and investigators tend to develop the set of 
conditions that optimizes the production of a specific protein 
in a particular host cell. The differences in detail notwithstanding, 
the same fundamental strategies may be used to 
create a variety of different expression systems. 
REVIEW QUESTIONS 


1. Suggest several ways that the expression of a cloned gene 
can be manipulated for optimal expression. 

2. What is lacP, and how is it used? 

3. Why is the plasmid that contains the strongest promoter 
not always the best expression vector? 

4. What is the tac promoter, and how is it regulated? 

5. The pL promoter is derived from E. coli bacteriophage λ, 
which cannot infect other bacteria, yet it is sometimes used 
as part of a broad-host-range expression vector. Explain how 
the pL promoter can be used to promote transcription in 
organisms other than E. coli. 

6. Sometimes the strategy for the expression of a target protein 
in a host organism involves synthesizing the protein as 
part of a fusion protein. Why is this approach useful? How is 
a fusion protein created? 

7. What are inclusion bodies, and how can their formation 
be avoided? 

8. Why would you want to express a foreign protein on the 
surface of a bacterium or bacteriophage? How would you do 
this? 

9. How would you manipulate the DNA sequences 
upstream of a target gene to modulate expression of that 
gene? 


10. How would you avoid some of the problems associated 
with the limited amount of oxygen that is available to 
growing E. coli cells when a foreign protein is overproduced? 

11. A specific target DNA fragment to be integrated into the 
chromosomal DNA of the host organism can include (1) only 
the target gene sequence or (2) the entire plasmid, including 
the target sequence. Explain how each of these results might 
occur. What are the advantages or disadvantages of the 
plasmid vector becoming incorporated into the host chromosomal 
DNA? 

12. What factors are responsible for a metabolic load? 

13. Suggest a number of different strategies to limit the 
extent of metabolic load on E. coli cells that are designed to 
overproduce a recombinant protein. 

14. During the course of integrating a target gene into the 
chromosomal DNA of the host bacterium, a marker gene 
may also be inserted into the chromosomal DNA. What 
strategy could be used to excise only the marker gene? 

15. How do biofilms limit the production of recombinant 
proteins? How might this limitation be overcome? 

16. How can E. coli host cells be engineered so that complex 
proteins with a large number of disulfide bonds are properly 
folded and therefore produced in an active form rather than 
as part of an inclusion body? 
17. How can E. coli host cells be engineered to yield high 
levels of expression of foreign proteins that contain significant 
numbers of rare E. coli codons? 

18. What is the T7 expression system, and how does it work? 

19. What features affect the strength of a bacterial promoter? 


20. How can a protein of interest be engineered to be 
secreted to the medium by E. coli? 

21. What is an intein, and how is it used to purify foreign 
proteins expressed in E. coli? 
Heterologous 
Protein Production 
in Eukaryotic Cells 


Heterologous (recombinant) proteins from cloned DNA originating 
from a wide variety of organisms have been successfully 
produced using prokaryotic expression systems. Although expression 
of any gene from any source organism in a prokaryotic host is theoretically 
possible, in practice, the eukaryotic proteins produced in bacteria 
do not always have the desired biological activity or stability. In addition, 
despite careful purification procedures, bacterial compounds that are toxic 
or that cause a rise in body temperature in humans and animals (pyrogens) 
may contaminate the final product. To avoid these problems, investigators 
have developed eukaryotic expression systems in fungal, insect, and mammalian 
cells for the production of uncontaminated therapeutic agents for 
either humans or animals; large quantities of stable, biologically active 
proteins for biochemical, biophysical, and structural studies; and proteins 
for industrial processes. Moreover, any human protein intended for medical 
use must be identical to the natural protein in all its properties. The 
inability of prokaryotic organisms to produce authentic versions of eukaryotic 
proteins is, for the most part, due to improper posttranslational protein 
processing, including improper protein cleavage and folding, and to the 
absence of appropriate mechanisms that add chemical groups to specific 
amino acid acceptor sites. 


Posttranslational Modification of Eukaryotic Proteins 

In prokaryotes, the steps in protein synthesis are not compartmentalized, 
and therefore, translation of messenger RNA (mRNA) occurs concurrently 
with transcription; as soon as the nascent transcript emerges from RNA 
polymerase, it is accessible to the ribosome to begin translation. With the 
aid of folding proteins, known as chaperones, that bind to polypeptides as 
they are being synthesized, proteins are folded into their proper three- 
dimensional configuration during synthesis. In contrast, eukaryotes transport 
mRNA from the nucleus to ribosomes in the cytoplasm or on the 
endoplasmic reticulum, where translation occurs. Proteins produced on 
ribosomes associated with the endoplasmic reticulum either are inserted in 
FIGURE 7.1 Cleavage of inactive preproinsulin to yield active mature insulin. 
Proteases remove the leader peptide (L) and an internal peptide (C), yielding a 
peptide that consists of chains A and B. 


the membrane of the endoplasmic reticulum or are secreted into the lumen 
of the endoplasmic reticulum during synthesis, where they are processed 
further. 

Many proteins, including most of those that are of interest as therapeutic 
agents for the treatment of human or animal diseases, undergo some 
type of posttranslational processing that is required for protein activity and 
stability. Some proteins are produced as inactive precursor polypeptides 
that must be cleaved by proteases at specific sites to produce the active 
form of the protein. For example, the small peptide hormone insulin is 
produced in animal pancreatic cells as a single polypeptide, preproinsulin, 
that is cleaved to produce two shorter peptides that are joined by disulfide 
bonds (Fig. 7.1). Production of inactive preproinsulin ensures that the peptide 
is not active in the pancreatic cells that produce it, but upon secretion, 
cleaved mature insulin can act on other cells. Similarly, the digestive 
enzyme trypsin, which degrades proteins, is produced as an inactive 
precursor polypeptide, trypsinogen, to avoid digestion of components of 
the producing cell. Upon secretion into the small intestine, trypsinogen is 
cleaved by an enteropeptidase to yield the active enzyme trypsin. 

Similar to prokaryotes, proper folding of proteins in eukaryotic cells 
requires the assistance of chaperones. In the endoplasmic reticulum, the 
chaperones BiP and calnexin bind nascent polypeptides, and protein disulfide 
isomerases catalyze the formation of disulfide bonds between adjacent 
cysteine residues. Proper folding is important, not only for the protein to 
attain a configuration for optimal activity, but also to protect regions of the 
protein that would otherwise be recognized by proteases that destroy the 
protein. Quality control systems ensure that only correctly folded proteins 
are released from the endoplasmic reticulum and transported within vesicles 
to the Golgi apparatus for further processing. Proteins intended for 
secretion from the cell are subsequently transported to the cell membrane 
within specific transport vesicles and released by exocytosis. 

The addition of specific sugars (glycosylation) to certain amino acids 
is a major modification that provides stability and distinctive binding 
FIGURE 7.2 Examples of some O-linked oligosaccharides in yeasts (A), insects 
(B), and mammals (C). O-linked oligosaccharides have a number of arrangements 
with different combinations of sugars. Some of the more prevalent 
forms are shown here. S, serine; T, threonine; red circles, mannose; dark-blue 
squares, N-acetylglucosamine; light-blue squares, N-acetylgalactosamine; 
green squares, galactose; orange squares, sialic acid. 


properties to a protein. Proper protein glycosylation is important because 
it contributes to protein conformation by influencing protein folding; can 
target a protein to a particular location, for example, through interaction 
with a specific receptor molecule; or can increase protein stability by protecting 
it from proteases. In the cell, oligosaccharides are attached to 
newly synthesized proteins in the endoplasmic reticulum and in the Golgi 
apparatus by specific enzymes known as glycosylases and glycosyltransferases. 
Different tissues may differentially glycosylate the same protein, 
thereby increasing protein heterogeneity. Because different sugar modifications 
can alter the properties of a protein, this presents opportunities for 
protein engineering to improve the efficacy or to alter the activity of a 
protein. Therapeutic proteins that require glycosylation for activity include 
antibodies, blood factors, some interferons, and some hormones. 

The most common glycosylations entail the attachment of specific 
sugars to the hydroxyl group of either serine or threonine (O-linked 
glycosylation) (Fig. 7.2) and to the amide group of asparagine (N-linked 
glycosylation) (Fig. 7.3). About 50% of all human proteins are glycosylated. The 
initial core sugar groups that are added to these amino acid acceptor sites 
tend to be similar among eukaryotes, although the subsequent elaborations 
among yeasts, insects, and mammals are quite diverse, especially for 
N-linked glycosylation. Other amino acid modifications include phosphorylation, 
acetylation, sulfation, acylation, γ-carboxylation, and the addition 
of C14 and C16 fatty acids, i.e., myristoylation (or myristylation) and 
palmitoylation (or palmitylation), respectively. 
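
The N-linked acceptor motif described in the legend to Fig. 7.3, an asparagine (N) followed by any amino acid (X) and then serine (S) or threonine (T), is easy to scan for in a protein sequence. The Python sketch below is only a motif search under that stated rule: it reports candidate sites, not sites that are actually glycosylated in a given host. The function name and the test peptide are hypothetical, and the optional exclusion of proline at position X is a commonly cited refinement that is not stated in the text.

```python
import re
from typing import List


def find_n_glycosylation_sites(protein: str, exclude_proline: bool = False) -> List[int]:
    """Return 1-based positions of asparagines in N-X-(S/T) motifs.

    protein: one-letter amino acid sequence.
    exclude_proline: if True, X may be any residue except proline
        (a refinement not stated in the text; off by default).
    """
    x = "[^P]" if exclude_proline else "."
    # A lookahead is used so that overlapping motifs are all reported.
    pattern = re.compile(rf"(?=N{x}[ST])")
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    peptide = "MNGSANPTQNAT"  # hypothetical sequence for illustration
    print(find_n_glycosylation_sites(peptide))
    print(find_n_glycosylation_sites(peptide, exclude_proline=True))
```

A scan of this kind is a first screen only; whether a candidate asparagine is modified, and with which oligosaccharide, depends on the host cell, as the surrounding text explains.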

Unfortunately, there is no universally effective eukaryotic host cell that 
performs the correct modifications on every protein. In some cases, a host 
cell may add unusual sugars to either authentic or spurious amino acid 
sites and, consequently, create an extremely antigenic protein or possibly 
one that lacks its proper function. However, even though a recombinant 
protein may fall short of the stringent properties that are required for a 
therapeutic agent, it may still be useful for either research or industrial 
purposes. Different eukaryotic expression systems must be tested to determine 
which one synthesizes the largest amount of a functional recombinant 
protein. The choice of an expression system depends primarily on the 
quality of the recombinant protein that is produced, but the yield of 
product, ease of use, and cost of production and purification are also 
important considerations. 


General Features of Eukaryotic Expression Systems 

The basic requirements for expression of a target protein in a eukaryotic 
host are similar to those required in prokaryotes. Vectors into which the 
target gene is cloned for delivery into the host cell can be specialized plasmids 
designed to be maintained in the eukaryotic host, such as the yeast 
2μm plasmid; host-specific viruses, such as the insect baculovirus; or artificial 
chromosomes, such as the yeast artificial chromosome (YAC). The 
vector must have a eukaryotic promoter that drives the transcription of the 
cloned gene of interest, eukaryotic transcriptional and translational stop 
signals, a sequence that enables polyadenylation of the mRNA, and a 
selectable eukaryotic marker gene (Fig. 7.4). Because recombinant DNA 
procedures are technically difficult to carry out with eukaryotic cells, most 
eukaryotic vectors are shuttle vectors with two origins of replication and 
selectable marker genes. One set functions in the bacterium Escherichia coli, 
FIGURE 7.3 Examples of some N-linked oligosaccharides in yeasts (A), insects (B), 
and mammals (C). All N-linked glycosylations in eukaryotes start with the same 
initial group, which is subsequently trimmed and then elaborated in diverse ways 
within and among species. Some yeast sites have 15 or fewer mannose units (core 
series), and others have more (outer-chain family). In S. cerevisiae, the chains 
frequently have 50 or more mannose units. An asparagine (N) residue next to any 
amino acid (X) followed by either threonine (T) or serine (S) can be targeted for 
glycosylation. Red circles, mannose; dark-blue squares, N-acetylglucosamine; 
yellow triangles, glucose; green squares, galactose; orange squares, sialic acid; 
maroon triangle, fucose. 


and the other set functions in the eukaryotic host cell. If a eukaryotic 
expression vector is to be used as a plasmid, i.e., as extrachromosomal replicating 
DNA, then it must also have a eukaryotic origin of replication. 
Alternatively, if the vector is designed for stable integration into the host 
chromosomal DNA, then it must have a sequence that is complementary to 
FIGURE 7.4 Generalized eukaryotic expression vector. The major features 
of a eukaryotic expression vector are a eukaryotic transcription unit with a 
promoter (p), a multiple cloning site (MCS) for a gene of interest, and a 
DNA segment with termination and polyadenylation signals (t); a eukaryotic 
selectable marker (ESM) gene system; an origin of replication that 
functions in the eukaryotic cell (ori^euk); an origin of replication that functions 
in E. coli (ori^E); and an E. coli selectable marker (Amp^r) gene. 


a segment of host chromosomal DNA to facilitate insertion into a chromosomal 
site. 
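
The requirements listed above for a eukaryotic shuttle vector can be summarized as a checklist. The Python sketch below is one way to express that checklist; the feature labels, class name, and vector names are all hypothetical and chosen for this illustration, and the check covers only the elements named in the text (it says nothing about whether a real construct would work).

```python
from dataclasses import dataclass
from typing import FrozenSet, Set

# Features the text requires of every eukaryotic shuttle expression vector.
# The label strings are illustrative names, not a standard vocabulary.
REQUIRED_ALWAYS: FrozenSet[str] = frozenset({
    "eukaryotic_promoter",
    "multiple_cloning_site",
    "termination_polyadenylation_signals",
    "eukaryotic_selectable_marker",
    "ecoli_origin",
    "ecoli_selectable_marker",
})


@dataclass(frozen=True)
class ShuttleVector:
    name: str
    features: FrozenSet[str]
    mode: str  # "episomal" (replicating plasmid) or "integrating"


def missing_features(vector: ShuttleVector) -> Set[str]:
    """Return the required features that the vector lacks."""
    required = set(REQUIRED_ALWAYS)
    if vector.mode == "episomal":
        # Extrachromosomal replication needs a eukaryotic origin.
        required.add("eukaryotic_origin")
    elif vector.mode == "integrating":
        # Integration needs a segment matching host chromosomal DNA.
        required.add("host_homology_segment")
    else:
        raise ValueError(f"unknown mode: {vector.mode!r}")
    return required - set(vector.features)


if __name__ == "__main__":
    features = REQUIRED_ALWAYS | {"eukaryotic_origin"}
    print(missing_features(ShuttleVector("pExample", features, "episomal")))
    print(missing_features(ShuttleVector("pExample", features, "integrating")))
```

The same feature set passes as an episomal vector but fails as an integrating one, because integration depends on a host homology segment rather than a eukaryotic origin of replication.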

The introduction of DNA into bacterial and yeast cells is called transformation. 
In these systems, the term describes an inherited change due to 
the acquisition of exogenous (foreign) DNA. However, in animal cells, 
transformation refers to changes in the growth properties of cells in culture 
after they become cancerous. To avoid confusion, the word transfection has 
been chosen to denote inherited changes in animal cells that are due to the 
addition of exogenous DNA. 

Three techniques are commonly used to transform yeasts: electroporation, 
lithium acetate treatment, and cell wall removal (protoplast formation). 
Transfection of cultured animal cells is achieved by incubating 
cells with DNA that has been coprecipitated with either calcium phosphate 
or diethylaminoethyl (DEAE)-dextran or by electroporation. Electroporation 
entails subjecting cells to short pulses of electric current, thus creating transient 
pores through which DNA enters the cell (Fig. 3.33). Viruses, lipid-DNA 
complexes, and protein-DNA aggregates are also used to transfer 
exogenous DNA into a recipient animal cell. 


Fungus-Based Expression Systems 

Fungi share many of the molecular, genetic, and biochemical features of 
other, "higher" eukaryotes and are therefore a good choice for heterologous 
protein production. They have growth advantages similar to those of 
prokaryotes, such as rapid growth in low-cost medium; generally do not 
require growth factors to be added to the growth medium; can correctly 
process eukaryotic proteins; and can secrete large amounts of heterologous 
proteins. Initially, the yeast Saccharomyces cerevisiae was used extensively as 
a host cell for the expression of cloned eukaryotic genes. It has a long history 
of use in traditional biotechnologies in the brewing and baking industries. 
Today, a variety of fungal expression systems are available, and they 
have been optimized for recombinant protein expression. Versatile expression 
vectors with broad host ranges have been constructed because the 
optimal host for production of a particular target protein must often be 
determined experimentally in a number of different systems. 

Saccharomyces cerevisiae Expression Systems 

High levels of recombinant protein production have been achieved using 
S. cerevisiae. The advantages of using this single-celled yeast are several. 
First, a great deal is known about the biochemistry, genetics, and cell 
biology of the fungus. The genome sequence of S. cerevisiae was completed 
in 1996, and it is used extensively in studies as a model organism for cell 
function. It can be grown rapidly to high cell densities on relatively simple 
media in both small culture vessels and large-scale bioreactors. Second, 
several strong promoters have been isolated from the yeast and characterized, 
and a naturally occurring plasmid, called the 2μm plasmid, can be 
used as part of an endogenous yeast expression vector system. Third, S. 
cerevisiae is capable of carrying out many posttranslational modifications. 
Fourth, the yeast normally secretes so few proteins that, when it is engineered 
for extracellular release of a recombinant protein, the product can 
be easily purified. Fifth, because of its years of use in the baking and 
brewing industries, S. cerevisiae has been listed by the U.S. Food and Drug 
Administration as a "generally recognized as safe" organism. It does not 
harbor human pathogens or produce fever-stimulating pyrogens. Therefore, 
the use of the organism for the production of human therapeutic agents 
(drugs or pharmaceuticals) does not require the same extensive experimentation 
demanded for unapproved host cells. A number of proteins that 
have been produced in S. cerevisiae are currently being used commercially 
as vaccines, pharmaceuticals, and diagnostic agents (Table 7.1). For 
example, at present, more than 50% of the world supply of insulin is produced 
by S. cerevisiae. Engineered S. cerevisiae strains are also major producers 
of a hepatitis B vaccine. 

S. cerevisiae vectors. There are three main classes of S. cerevisiae expression 
vectors: episomal, or plasmid, vectors (yeast episomal plasmids [YEps]); 
integrating vectors (yeast integrating plasmids [YIps]); and YACs. Of these, 
episomal vectors have been used extensively for the production of either 
intra- or extracellular heterologous proteins. Typically, the vectors contain 
features that allow them to function in both bacteria and S. cerevisiae. An E. 
coli origin of replication and bacterial antibiotic resistance genes are usually 
included on the vector, enabling all manipulations to first be performed in 
E. coli before the vector is transferred to S. cerevisiae for expression. 

The YEp vectors are based on the high-copy-number 2μm plasmid, a 
small, circular plasmid found in most natural strains of S. cerevisiae. The 
vector replicates independently of the host chromosome via a single origin 
of replication (autonomous replicating sequence [ARS]/STB loci), and is 
maintained in more than 30 copies per cell. Many S. cerevisiae selection 


TABLE 7.1 Recombinant proteins produced by S. cerevisiae expression systems 

Vaccines 
  Hepatitis B virus surface antigen 
  Malaria circumsporozoite protein 
  HIV-1 envelope protein 

Diagnostics 
  Hepatitis C virus protein 
  HIV-1 antigens 

Human therapeutic agents 
  Epidermal growth factor 
  Insulin 
  Insulin-like growth factor 
  Platelet-derived growth factor 
  Proinsulin 
  Fibroblast growth factor 
  Granulocyte-macrophage colony-stimulating factor 
  α1-Antitrypsin 
  Blood coagulation factor XIIIa 
  Hirudin 
  Human growth factor 
  Human serum albumin 

HIV-1, human immunodeficiency virus type 1. 

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TABLE 7.2 Promoters for S. cerevisiae expression vectors 


Promoter 

Expression conditions 

Status 

Acid phosphatase (PH05) 

Phosphate-deficient medium 

Inducible medium 

Alcohol dehydrogenase I (ADHI) 

2-5% Glucose 

Constitutive 

Alcohol dehydrogenase II (ADH11) 

0.1-0.2% Glucose 

Inducible 

Cytochrome Cj (CYC1) 

Glucose 

Repressible 

Gal-l-P Glc-l-P uridyltransferase 

Galactose 

Inducible 

Galactokinase (GAL1) 

Galactose 

Inducible 

Glyceraldehyde-3-phosphate 
dehydrogenase (GAPD, GAPDH) 

2-5% Glucose 

Constitutive 

Metallothionein (CUP1) 

0.03-0.1 mM Copper 

Inducible 

Phosphoglycerate kinase (PGK) 

2-5% Glucose 

Constitutive 

Triosephosphate isomerase (TPI) 

2-5% Glucose 

Constitutive 

UDP-galactose epimerase (GAL10) 

Galactose 

Inducible 


schemes rely on mutant host strains that require a particular amino acid 
(histidine, tryptophan, or leucine) or nucleotide (uracil) for growth. Such 
strains are said to be auxotrophic because minimal growth medium must 
be supplemented with a specific nutrient. In practice, the vector is equipped 
with a functional (wild-type) version of a gene that complements the 
mutated gene in the host strain. For example, when a YEp with a wild-type 
LEU2 gene is transformed into a mutant leu2 host cell and plated onto 
medium that lacks leucine, only cells that carry the plasmid will grow. 
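
The selection logic described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the function name `can_grow`, the `MARKER_NUTRIENT` table, and the set-based representation of strains and media are hypothetical and were invented for this example. They do not describe a laboratory protocol or any existing software.

```python
# Toy model of auxotrophic marker selection (illustrative assumption, not a protocol).
# Each marker gene is mapped to the nutrient whose biosynthesis it supports.
MARKER_NUTRIENT = {
    "LEU2": "leucine",
    "TRP1": "tryptophan",
    "URA3": "uracil",
}


def can_grow(host_mutations, plasmid_markers, supplements):
    """Return True if the cell can grow on minimal medium plus `supplements`.

    host_mutations  -- marker genes that are defective in the host
                       (e.g. {"LEU2"} for a leu2 strain)
    plasmid_markers -- wild-type marker genes carried on the vector
    supplements     -- nutrients added to the minimal medium
    """
    for gene in host_mutations:
        complemented = gene in plasmid_markers
        supplied = MARKER_NUTRIENT[gene] in supplements
        if not (complemented or supplied):
            return False  # the nutrient is neither synthesized nor provided
    return True


# A leu2 host plated on medium that lacks leucine:
print(can_grow({"LEU2"}, {"LEU2"}, set()))     # True: cell carries a LEU2 plasmid
print(can_grow({"LEU2"}, set(), set()))        # False: untransformed cell
print(can_grow({"LEU2"}, set(), {"leucine"}))  # True: medium is supplemented
```

The second call is the case that makes the scheme selective: without the plasmid, a leu2 cell cannot grow unless leucine is supplied.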

A number of promoters derived from S. cerevisiae genes are available for 
engineering efficient transcription of heterologous genes in yeast vectors 
(Table 7.2). Generally, tightly regulatable, inducible promoters are preferred 
for producing large amounts of recombinant protein at a specific time 
during large-scale growth. In this context, the galactose-regulated promoters
respond rapidly to the addition of galactose with a 1,000-fold
increase in transcription. Repressible, constitutive, and hybrid promoters 
that combine the features of different promoters are also available. Maximal 
expression depends on efficient termination of transcription. Often, for YEp 
vectors, the terminator sequence is from the same gene as the promoter. 
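
As a rough aid for choosing among the promoters in Table 7.2, the table can be held as a small lookup structure. The sketch below is hypothetical: the names `PROMOTERS`, `promoters_with_status`, and `induced_by` are assumptions made for illustration, and the rows simply restate Table 7.2 (gene symbols where the table gives one, the enzyme name otherwise).

```python
# Rows restated from Table 7.2: (promoter, expression conditions, status).
PROMOTERS = [
    ("PHO5", "phosphate-deficient medium", "inducible"),
    ("ADHI", "2-5% glucose", "constitutive"),
    ("ADHII", "0.1-0.2% glucose", "inducible"),
    ("CYC1", "glucose", "repressible"),
    ("Gal-1-P Glc-1-P uridyltransferase", "galactose", "inducible"),
    ("GAL1", "galactose", "inducible"),
    ("GAPD", "2-5% glucose", "constitutive"),
    ("CUP1", "0.03-0.1 mM copper", "inducible"),
    ("PGK", "2-5% glucose", "constitutive"),
    ("TPI", "2-5% glucose", "constitutive"),
    ("GAL10", "galactose", "inducible"),
]


def promoters_with_status(status):
    """List promoters whose regulatory status matches `status`."""
    return [name for name, _conditions, s in PROMOTERS if s == status]


def induced_by(signal):
    """List inducible promoters whose expression conditions mention `signal`."""
    return [
        name
        for name, conditions, s in PROMOTERS
        if s == "inducible" and signal in conditions
    ]


print(promoters_with_status("constitutive"))  # ['ADHI', 'GAPD', 'PGK', 'TPI']
print(induced_by("galactose"))
```

A lookup of this kind only mirrors the table. Promoter strength and tightness of regulation, which the text treats as the deciding factors, are not encoded here.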

Many heterologous genes are provided with a DNA coding sequence 
for an amino acid segment (signal sequence, signal peptide, or leader 
sequence) that facilitates the passage of the recombinant protein through 
cell membranes and its release to the external environment. The main 
reason for this modification is that it is much easier to purify a secreted 
protein than one from a cell lysate. The most commonly used signal 
sequence for S. cerevisiae is derived from the mating factor α gene. Also,
synthetic leader sequences have been created to increase the amount of 
secreted protein. Other sequences that stabilize the recombinant protein, 
protect it from proteolytic degradation, and provide a specific amino acid 
sequence (affinity tag) that is used for selective purification can be fused 
onto the coding sequence of the heterologous gene. These extra amino acid 
sequences are equipped with a protease cleavage site so that they can be 
removed from the recombinant protein after it is purified. 

Plasmid-based yeast expression systems are often unstable under large-scale
(>10 liters) growth conditions even in the presence of selection pressure.
To remedy this problem, a heterologous gene is integrated into the host
genome to provide a more reliable production system. Different approaches
have been devised for the integration of a cloned gene together with a
selectable marker gene into an S. cerevisiae chromosome. Briefly, a functional
selectable marker gene and a heterologous gene equipped with yeast-specific
transcription and translation control sequences are inserted between
two DNA segments derived from the ends of a nonessential yeast gene. In 
this instance, the plasmid does not usually carry an origin of replication that 
functions in yeast cells. The plasmid is cleaved, and the linear fragment is 
transformed into S. cerevisiae. A double recombination event between 
homologous sequences on the linearized plasmid and a chromosome in the 
host inserts the piece of DNA with both target and marker genes into a


FIGURE 7.5 Schematic representation of integration of DNA with a YIp vector. A
selectable marker gene (LEU2) and a gene of interest (GOI) with transcription and
translation control elements (not shown) are inserted into a YIp vector between two
segments from the ends of a nonessential yeast gene (A1 and A2). The ampicillin
resistance (Amp^r) gene and the origin of replication (ori^E) function in E. coli. A
leucine-requiring (leu2) yeast strain is transformed with restriction
endonuclease-digested (RE) vector DNA because chromosomal DNA is more likely to
recombine with linearized DNA than with circular DNA. The restriction endonuclease
sites flank the segments from the nonessential gene. The DNA sequences at the ends of
nonessential gene A undergo recombination (x) that leads to the incorporation of
both the gene of interest and the LEU2 gene into the corresponding chromosome
site. Transformants grow on medium that is not supplemented with leucine.
Nonrecombined DNA is degraded.

specific chromosome site (Fig. 7.5). The plasmid DNA is linearized because
DNA in this form is more likely than circular DNA to recombine with chromosome
DNA. The DNA that is not integrated is lost during successive cell
divisions. The major drawback of this strategy is the low yield of recombinant
protein from a single gene copy.
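
The double recombination described above can be pictured as a string operation. The Python sketch below is a deliberately simplified model, assuming that a chromosome and a linearized fragment can be treated as plain strings. The function name `integrate` and the short placeholder sequences are hypothetical, and real homology segments are far longer than four letters.

```python
def integrate(chromosome, left_arm, right_arm, cassette):
    """Model a double recombination between a linear fragment and a chromosome.

    The linear fragment is left_arm + cassette + right_arm. If both arms are
    found in the chromosome (left arm first), the chromosomal DNA between
    them is replaced by the cassette. Returns None when no integration is
    possible.
    """
    start = chromosome.find(left_arm)
    if start == -1:
        return None
    inner_start = start + len(left_arm)
    inner_end = chromosome.find(right_arm, inner_start)
    if inner_end == -1:
        return None
    return chromosome[:inner_start] + cassette + chromosome[inner_end:]


# Hypothetical placeholder sequences: "AAAT" and "TTTA" stand for the two
# ends of a nonessential gene; "CCCCC" stands for its interior.
chromosome = "GGG" + "AAAT" + "CCCCC" + "TTTA" + "GGG"
result = integrate(chromosome, "AAAT", "TTTA", "[LEU2][GOI]")
print(result)  # GGGAAAT[LEU2][GOI]TTTAGGG
```

In this model the interior of the nonessential gene is lost and the marker and gene of interest take its place, which is why the target must be a gene the host can do without.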

To increase the number of copies of an integrated heterologous gene 
and thereby increase the overall yield of the recombinant protein, the gene 
can be targeted to yeast repetitive DNA sequences, such as the sequences 
(ribosomal DNA [rDNA]) encoding ribosomal RNA that are present in 
approximately 200 copies. Alternatively, the heterologous gene can be integrated
into several of the 400 copies of δ sequences in the S. cerevisiae
genome. The δ sequences are parts of nonessential DNA elements derived
from retrotransposons. Briefly, retrotransposons are chromosomal DNA
sequences that are transcribed but not translated into protein. The RNA
molecules act as templates for the synthesis of DNA by the enzyme reverse
transcriptase, and the copied DNA sequences integrate into the chromosome
at different sites. In one study, 10 copies of a heterologous gene were
inserted into δ sequences and produced a significant amount of the recombinant
protein.

A YAC is designed to clone a large segment of DNA (100 kilobase pairs 
[kb]), which is then maintained as a separate chromosome in the host yeast 
cell. The YAC system is highly stable and has been used for the physical 
mapping of human genomic DNA, the analysis of large transcription units, 
and the formation of genomic libraries containing DNA from individual 
human chromosomes. A YAC vector mimics a chromosome because it has 
a sequence that acts as an origin of DNA replication (ARS), a yeast centromere
sequence to ensure that after cell division each daughter cell
receives a copy of the YAC, and telomere sequences that are present at both 
ends after linearization of the YAC DNA for stability (Fig. 7.6). In some 
cases, the input DNA is cloned into a site that disrupts a yeast marker gene. 
In the absence of the product of the marker gene, a colorimetric response is 
observed when recipient cells are grown on a specialized medium. 
Alternatively, some YAC vectors contain a selectable marker gene that is 
independent of the cloning site. To date, YACs have not been used as 
expression systems for the commercial production of heterologous proteins,
although they have the potential to produce large amounts of either
a single protein from multiple copies of the same gene or a heterologous 
protein with different subunits. 

Intracellular production of heterologous proteins in S. cerevisiae. Most S. 
cerevisiae intracellular expression systems have the same basic features. 
Here, the production of the human enzyme superoxide dismutase (SOD) 
will be used to illustrate the process. Superoxide anion is a by-product of 
oxygen utilization in aerobic organisms. In humans, this anion helps both 
to stimulate the inflammatory response of phagocytes and to direct leukocytes
to the site of an infection. However, too much of the molecule and its
derivatives can cause cellular damage. To minimize these potentially cytotoxic
effects, the naturally occurring cytoplasmic enzyme Cu/Zn-SOD
scavenges the superoxide radical and combines it with a hydrogen ion to 
form hydrogen peroxide, which in turn is degraded to water and oxygen 
by catalase or peroxidase. Superoxide anion is also produced when blood 
is allowed to reenter an organ (reperfusion) after it has been deprived of 
blood during a surgical procedure. To prevent this source of superoxide

FIGURE 7.6 YAC cloning system. The YAC plasmid (pYAC) has an E. coli selectable
marker (Amp^r) gene; an origin of replication that functions in E. coli (ori^E); and yeast
DNA sequences, including URA3, CEN, TRP1, and ARS. CEN provides centromere
function, ARS is a yeast autonomous replicating sequence that is equivalent to a
yeast origin of replication, URA3 is a functional gene of the uracil biosynthesis
pathway, and TRP1 is a functional gene of the tryptophan biosynthesis pathway.
The T regions are yeast chromosome telomeric sequences. The SmaI site is the
cloning insertion site. pYAC is first treated with SmaI, BamHI, and alkaline
phosphatase and then ligated with size-fractionated (100-kb) input DNA. The final
construct carries cloned DNA and can be stably maintained in double-mutant ura3
and trp1 cells.


anion damage, clinicians have speculated that Cu/Zn-SOD could be 
administered to an organ as it is being reperfused. In addition, Cu/Zn-SOD 
might act as a therapeutic agent against inflammatory diseases, such as 
osteoarthritis, rheumatoid arthritis, scleroderma, and ankylosing spondylitis.
For both of these uses, an authentic human form of Cu/Zn-SOD is
preferred to avoid any adverse immunological responses that might result 
from using an enzyme from another species. 

Initially, a complementary DNA (cDNA) for human Cu/Zn-SOD was 
cloned into an E. coli expression system. As expected, the E. coli host cells 
removed the initiator N-terminal methionine from the Cu/Zn-SOD protein.
However, the next amino acid (alanine) was not acetylated, as it is in
human cells. To produce a fully functional protein, the human Cu/Zn-SOD
cDNA was cloned into a YEp vector (Fig. 7.7). This YEp vector contained
(1) a 2µm plasmid origin of DNA replication, (2) a yeast gene for leucine
biosynthesis (LEU2) for selection in yeast, (3) an E. coli origin of replication,
(4) the ampicillin resistance (Amp^r) gene for selection in E. coli, and (5) the
human Cu/Zn-SOD cDNA inserted between the promoter region of the
yeast glyceraldehyde phosphate dehydrogenase gene (GAPDp) and a
sequence containing the signals for transcription termination and polyadenylation
of mRNA from the same gene (GAPDt). A leucine-defective (leu2)
yeast strain was transformed with the vector, and the cells were plated onto 
medium that lacked leucine. Only cells with the functional LEU2 gene, 
which was supplied by the vector, could grow under these conditions. The 
GAPD promoter is transcribed continuously (constitutively) during cell 
growth. In this experiment, the yeast cells produced high levels of intracellular
Cu/Zn-SOD that, like the authentic protein from human cells, had an
acetylated N-terminal alanine residue. 

Secretion of heterologous proteins by S. cerevisiae. All glycosylated proteins
of S. cerevisiae are secreted, and each must have a leader sequence to
pass through the secretory system. Consequently, the coding sequences of
recombinant proteins that require either O-linked or N-linked sugars for
biological activity must be equipped with a leader sequence. Usually, the
leader sequence from the yeast mating type α-factor gene (prepro-α-factor)
is inserted immediately in front (upstream) of the cDNA of the gene of
interest. Under these conditions, correct disulfide bond formation, proteolytic
removal of the leader sequence, and appropriate posttranslational
modifications often occur, and an active recombinant protein is secreted. 
During this process, the leader peptide is removed by an endoprotease that 
recognizes the dipeptide Lys-Arg. The Lys-Arg codons must be located 
adjacent to the cDNA sequence so that, following removal of the leader 
peptide, the recombinant protein will have the correct amino acid residue 

FIGURE 7.7 S. cerevisiae expression vector. The cDNA for human Cu/Zn-SOD was
cloned between the promoter (GAPDp) and termination-polyadenylation sequence
(GAPDt) of the S. cerevisiae glyceraldehyde phosphate dehydrogenase gene. The
LEU2 gene that was cloned between segments of the yeast 2µm plasmid DNA
encodes a functional enzyme of the leucine biosynthesis pathway. The yeast origin
of replication is included in the 2µm plasmid DNA. The ampicillin resistance
(Amp^r) gene and the E. coli origin of replication (ori^E) are derived from plasmid
pBR322.

Synthesis of Rabbit β-Globin in Cultured Monkey Kidney Cells Following
Infection with a SV40 β-Globin Recombinant Genome

R. C. Mulligan, B. H. Howard, and P. Berg
Nature 277:108-114, 1979

Conceptually, the development of a eukaryotic expression system appears to be a
relatively simple matter of assembling the appropriate regulatory sequences,
cloning them in the correct order into a vector, and then putting the gene of
interest into the precise location that enables it to be expressed. In reality,
the development of the first generation of eukaryotic expression vectors was a
painstaking process following a trial-and-error approach. Before Mulligan,
Howard, and Berg's study, a number of genes had been cloned into the mammalian
SV40 vectors, but mature, functional mRNAs were never detected after infection
of host cells. This problem was overcome by inserting the rabbit cDNA for
β-globin into an SV40 gene that had nearly all of its coding region deleted but
retained "all the regions implicated in transcriptional initiation and
termination, splicing and polyadenylation...." Both rabbit β-globin mRNA and
protein were synthesized in cells that were transfected with this β-globin
cDNA-SV40 construct. Mulligan et al. concluded, "The principal conceptual
innovation is the decision to leave intact the regions of the vector implicated
in...mRNA processing." This study established that an effective eukaryotic
expression system could be created by placing the cloned gene under the control
of transcription and translation regulatory sequences. It also stimulated
additional research that pinpointed in detail the structural prerequisites for
the next generation of eukaryotic expression vectors.


at its N terminus. For example, a properly processed and active form of the
protein hirudin was synthesized and secreted by an S. cerevisiae strain containing
a YEp vector that had the prepro-α-factor sequence added to the
hirudin coding sequence. The gene for hirudin is from an invertebrate, the
leech Hirudo medicinalis. This protein is a powerful blood anticoagulant that
is not immunogenic in humans.
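
A short Python sketch can show why the Lys-Arg codons must sit directly next to the coding sequence. It treats proteins as strings of one-letter amino acid codes, with Lys-Arg written as "KR". Everything here is an assumption made for illustration: the function names are hypothetical, and the leader and mature sequences are toy placeholders, not the real prepro-α-factor or hirudin sequences.

```python
CLEAVAGE_SITE = "KR"  # one-letter amino acid codes for Lys-Arg


def build_fusion(leader, mature, spacer=""):
    """Join leader, Lys-Arg site, optional spacer residues, and mature protein."""
    return leader + CLEAVAGE_SITE + spacer + mature


def remove_leader(fusion, leader):
    """Cut immediately after the Lys-Arg that follows the leader.

    Returns the released protein, i.e. everything downstream of the cut.
    """
    if not fusion.startswith(leader + CLEAVAGE_SITE):
        raise ValueError("no Lys-Arg site directly after the leader")
    cut = len(leader) + len(CLEAVAGE_SITE)
    return fusion[cut:]


# Hypothetical toy sequences used only to show the position of the cut.
leader = "MSTLLA"
mature = "VVYTD"

adjacent = remove_leader(build_fusion(leader, mature), leader)
with_spacer = remove_leader(build_fusion(leader, mature, spacer="EA"), leader)
print(adjacent)     # VVYTD: correct N terminus
print(with_spacer)  # EAVVYTD: two extra residues remain at the N terminus
```

When residues separate the Lys-Arg site from the coding sequence, they stay attached to the released protein, so its N terminus no longer matches that of the authentic protein.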

Over the last 10 years, the amount of heterologous protein that can be 
produced per liter of yeast culture has increased 100-fold (from about 0.02 
to 2 g/liter). This increase is mainly due to improvements in growing cultured
cells to high cell densities; the level of protein produced per cell has
remained largely unchanged. Although there have been significant advances
in techniques to increase the number of copies of a target gene in a host cell
and to increase the expression levels of these genes, the overexpressed proteins
tend to form intracellular aggregates, often associated with molecular
chaperones, rather than being secreted into the medium, where they would be
easier to purify. Major problems that must be addressed to increase
heterologous-protein secretion in yeast cells are the incorrect folding of the
polypeptide, the activation of cellular mechanisms to cope with the stress of
protein overproduction, and the aberrant processing and release of the 
protein of interest from the endoplasmic reticulum. 

One of the major reasons for producing a recombinant protein for use 
in human therapeutics in yeast rather than in bacteria is to ensure that the 
protein is processed correctly following synthesis. Correct protein folding 
occurs in the endoplasmic reticulum in eukaryotes and is facilitated by a 
number of different proteins, including molecular chaperones, enzymes for 
disulfide bond formation, signal transduction proteins that monitor the 
demand and capacity of the protein-folding machinery, and proteases that 
clear away improperly folded or aggregated proteins (Fig. 7.8). The eukaryotic
enzyme protein disulfide isomerase is instrumental in forming the
correct disulfide bonds within a protein. Aberrant disulfide bond formation
changes a protein's configuration, which abolishes protein activity and
causes instability. Poor yields of overexpressed proteins often occur because

FIGURE 7.8 Summary of protein folding in the endoplasmic reticulum of yeast cells.
During synthesis on ribosomes associated with the endoplasmic reticulum (ER),
nascent proteins are bound by the chaperones BiP and calnexin, which aid in the
correct folding of the protein. Protein disulfide isomerases (PDI) catalyze the
formation of disulfide bonds between cysteine amino acids that are nearby in the folded
protein. Quality control systems ensure that only correctly folded proteins are
released from the ER. Proteins released from the ER are transported to the Golgi
apparatus for further processing. Prolonged binding of BiP to misfolded proteins
leads to activation of the S. cerevisiae transcription factor Hac1, which controls the
expression of several proteins that mediate the unfolded-protein response (UPR).
Adapted from Gasser et al., Microb. Cell Fact. 7:11-29, 2008.


the capacity of the cell to properly fold and secrete proteins has been 
exceeded. Several strategies have therefore been implemented to increase 
the host cell's capacity to process higher than normal levels of proteins. For 
example, the overproduction of molecular chaperones and protein disulfide
isomerases might increase the yield of recombinant proteins, especially
those with disulfide bonds. To test this hypothesis, the yeast protein disulfide
isomerase gene was cloned between the constitutive glyceraldehyde
phosphate dehydrogenase promoter and a transcription terminator
sequence in a YIp vector, and the entire construct was integrated into a
chromosomal site. The modified strain showed a 16-fold increase in protein 
disulfide isomerase production compared with the wild-type strain. When 
protein disulfide isomerase-overproducing cells were transformed with a 
YEp vector carrying the gene for human platelet-derived growth factor B, 
there was a 10-fold increase in the secretion of recombinant protein over 
that of transformed cells with normal levels of protein disulfide isomerase. 
The overproduction of protein disulfide isomerase specifically increases 
the secretion of proteins with disulfide bonds. Higher levels of secreted 
products are also obtained for the recombinant proteins human erythropoietin,
bovine prochymosin, and leech hirudin in S. cerevisiae cells that overexpress
the chaperone BiP.

Overexpression of the molecular chaperone BiP or protein disulfide
isomerase increased the secretion of some heterologous proteins; however,
overexpression of a single chaperone may not have the desired outcome
and, in some instances, may increase the degradation of the target protein.
This is because proper protein folding requires the coordinated efforts of
many interacting factors (Fig. 7.8). Even when levels of one chaperone are
adequate, the levels of cochaperones or cofactors may be limiting. The
unfolded-protein response of yeast cells coordinates the expression of several
chaperones, as well as cochaperones. When the demand for protein
folding exceeds the folding capacity of the endoplasmic reticulum, the
unfolded-protein response increases the expression of chaperones, protein
disulfide isomerase, and other proteins involved in protein secretion.
Engineering the proteins of the unfolded-protein response may be a better
approach to increase the overall capacity of the cell to fold proteins in a
coordinated way by maintaining appropriate ratios of all factors required.
Accumulation of unfolded proteins in the endoplasmic reticulum activates
the S. cerevisiae transcription factor Hac1, which activates the expression of
proteins of the unfolded-protein response, and therefore expression of
Hac1 was targeted for genetic manipulation. Overexpression of Hac1 in S.
cerevisiae improved secretion of the important industrial enzyme α-amylase,
which is used for starch hydrolysis in a wide range of processes, such as
alcohol production, paper recycling, and oil drilling.

Pichia pastoris Expression Systems 

Recombinant proteins have been produced successfully in S. cerevisiae from 
cloned genes from many sources. However, in many cases, expression 
levels are low and protein yields are modest. One of the major drawbacks 
of using S. cerevisiae is the tendency for the yeast to hyperglycosylate heterologous
proteins by typically adding 50 to 150 mannose residues in
N-linked oligosaccharide side chains that often alter protein function.
Although the initial stages of addition of glycan chains to proteins in the
lumen of the endoplasmic reticulum are similar in yeast and humans, following
transfer of the protein to the Golgi apparatus, further processing
differs significantly. The outcome is the production of a sialylated protein
in humans and a hypermannosylated protein in yeast, with α-1,3 bonds
between the sugar residues that can make the heterologous protein antigenic
(Fig. 7.3). Also, proteins that are designed for secretion frequently are
retained in the periplasmic space, increasing the time and cost of purification.
Finally, S. cerevisiae produces ethanol at high cell densities, which is
toxic to the cells (the Crabtree effect) and, as a consequence, lowers the 
quantity of secreted protein. For these reasons, researchers have examined 
other yeast species and eukaryotic cells that could act as effective host cells 
for recombinant protein production. 

P. pastoris is a methylotrophic yeast that is able to utilize methanol as a 
source of energy and carbon. It is an attractive host for recombinant protein 
production because glycosylation occurs to a lesser extent and the linkages 
between sugar residues are of the α-1,2 type, which are not allergenic to
humans. With these natural characteristics as a starting point, a P. pastoris
strain was extensively engineered with the aim of developing a "humanized"
strain that glycosylates proteins in a manner identical to that of
human cells. Both human and yeast cells add the same small (10-residue),
branched oligosaccharide to nascent proteins in the endoplasmic reticulum
(Fig. 7.9). However, this is the last common precursor between the two cell
types, because once the protein is transported to the Golgi apparatus, further
processing is different. In the Golgi apparatus, yeast cells add an α-1,6

FIGURE 7.9 Differential processing of glycoproteins in P. pastoris, humans, and
"humanized" P. pastoris. Initial additions of sugar residues to glycoproteins in the
endoplasmic reticulum are similar in human and P. pastoris cells (A). However,
further N glycosylation in the Golgi apparatus differs significantly between the two
cell types. N-glycans are hypermannosylated in P. pastoris (B), while in humans,
mannose residues are trimmed and specific sugars are added, leading to termination
of the oligosaccharide in sialic acid (C). P. pastoris cells have been engineered to
produce enzymes that process glycoproteins in a manner similar to that of human
cells. In "humanized" P. pastoris, a recombinant glycoprotein produced in the
endoplasmic reticulum (D) is transported to the Golgi apparatus, where it is further
processed to yield a properly sialylated glycoprotein (E). Blue squares, 
N-acetylglucosamine; red circles, mannose; green squares, galactose; orange 
squares, sialic acid. Adapted from Hamilton and Gerngross, Curr. Opin. Biotechnol. 
18:387-392, 2007. 


mannose residue to the oligosaccharide, which subsequently leads to 
hypermannosylation. Mammalian cells, on the other hand, remove some 
mannose residues from the precursor (trimming) and then sequentially 
add specific sugars to yield a glycoprotein with an oligosaccharide that 
terminates in sialic acid. To create a "humanized" strain, the enzyme 
responsible for addition of the α-1,6 mannose was first eliminated from P.
pastoris to prevent hypermannosylation. Next, the gene encoding a mannose-trimming
enzyme (a mannosidase) from the filamentous fungus
Trichoderma reesei was inserted into the yeast genome and was found to trim
the oligosaccharide to a human-like precursor. Genes encoding enzymes
for the sequential addition of sugar residues that terminate the oligosaccharide
chains in galactose were also added. It should be noted that the
coding sequences for all engineered genes contained a secretion signal for
localization of the encoded protein to the Golgi apparatus. Finally, several
genes for proteins that catalyze the synthesis, transport to the Golgi apparatus,
and addition of sialic acid to the terminal galactose on the protein
precursor were inserted into the P. pastoris genome. Properly sialylated
recombinant proteins that can be used as human therapeutic agents have 
been produced by the "humanized" P. pastoris, including erythropoietin 
and antibodies. 
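
The order of the engineering steps above matters, and a toy decision model makes the dependencies explicit. The Python sketch below is a simplification under stated assumptions: the `Strain` class, its field names, and the outcome labels are hypothetical and were chosen for this example. The model tracks only which step is missing, not real glycan structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    """Flags for the engineering steps described above (hypothetical names)."""
    initiating_mannosyltransferase: bool = True  # adds the alpha-1,6 mannose
    mannosidase: bool = False           # mannose-trimming enzyme from T. reesei
    galactose_addition: bool = False    # enzymes extending to terminal galactose
    sialic_acid_addition: bool = False  # synthesis, transport, and transfer


def golgi_outcome(strain):
    """Return a label for the N-glycan leaving the Golgi in this toy model."""
    if strain.initiating_mannosyltransferase:
        return "hypermannosylated"
    if not strain.mannosidase:
        return "untrimmed precursor"
    if not strain.galactose_addition:
        return "trimmed human-like precursor"
    if not strain.sialic_acid_addition:
        return "galactose-terminated"
    return "sialylated"


wild_type = Strain()
humanized = Strain(
    initiating_mannosyltransferase=False,
    mannosidase=True,
    galactose_addition=True,
    sialic_acid_addition=True,
)
print(golgi_outcome(wild_type))  # hypermannosylated
print(golgi_outcome(humanized))  # sialylated
```

Each check corresponds to one sentence of the description: adding sialic acid has no effect in this model unless the α-1,6 mannose addition has been eliminated and the trimming and galactose steps are in place.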

During growth on methanol, enzymes required for catabolism of this 
substrate are expressed at very high levels with alcohol oxidase, the first 
enzyme in the methanol utilization pathway, encoded by the gene AOX1, 
representing as much as 30% of the cellular protein. Transcription of AOX1 
is tightly regulated; in the absence of methanol, the AOX1 gene is completely
turned off but responds rapidly to the addition of methanol to the
medium. Therefore, the AOX1 promoter is an excellent candidate for producing
large amounts of recombinant protein under controlled conditions.
Moreover, the induction of the cloned gene can be timed to maximize
recombinant protein production during large-scale fermentations. In contrast
to S. cerevisiae, P. pastoris does not synthesize ethanol, which can limit
cell yields; therefore, very high cell densities of P. pastoris are attained, with 
the concomitant secretion of large quantities of protein. P. pastoris normally 
secretes very few proteins, thus simplifying the purification of secreted 
recombinant proteins. 

Many P. pastoris expression vectors have been devised, each one having 
more or less the same format. The basic features include a gene of interest 
under the control of promoter and transcription termination sequences 
from the P. pastoris AOX1 gene, an E. coli origin of replication and selectable 
marker gene, and a yeast selectable marker gene (Fig. 7.10). The addition of 
a signal sequence from either the P. pastoris phosphatase PHO1 gene or
another yeast gene facilitates the secretion of a recombinant protein. To
avoid the problems of plasmid instability during long-term growth, most
P. pastoris vectors are designed to be integrated into the host genome, usually
within the AOX1 gene, the HIS4 gene for histidine biosynthesis, or
rDNA. Both the engineered gene of interest and a yeast selectable marker
gene are inserted together into a specific chromosome site by either a single
(Fig. 7.11A) or a double (Fig. 7.11B) recombination event. The P. pastoris
expression system has been used to produce more than 100 different biologically
active proteins from bacteria, fungi, invertebrates, plants, and
mammals, including humans. Many of these proteins, such as the hepatitis 
B virus surface antigen, human serum albumin, and bovine lysozyme, are 
identical to the native proteins and thus authentic. 

Other Yeast Systems 

Authentic heterologous proteins for industrial and pharmaceutical uses 
have also been generated in other yeasts. For example, the cDNAs for the 
α- and β-globin chains of human hemoglobin A were each cloned between
the methanol oxidase promoter (MOXp) and transcription terminator
(MOXt) sequences of the methylotrophic yeast Hansenula polymorpha and
placed in tandem in an expression vector. Fortuitous integration into a
chromosome yielded an isolate that produced functional hemoglobin A
that had the correct tetrameric organization of the two α- and two β-globin
chains (α2β2). Also, large amounts of the animal feed enzyme supplement
phytase have been produced by transformed H. polymorpha. 

The thermotolerant dimorphic yeasts Arxula adeninivorans and Yarrowia 
lipolytica have demonstrated promising potential as hosts for high levels of 
heterologous-protein expression. These yeasts can grow at temperatures up 


FIGURE 7.10 P. pastoris integrating expression vector. The gene of interest
(GOI) is cloned between the promoter (AOX1p) and termination-polyadenylation
sequence (AOX1t) of the P. pastoris alcohol oxidase 1 gene. The HIS4 gene
encodes a functional histidinol dehydrogenase of the histidine biosynthesis
pathway. The ampicillin resistance (Amp^r) gene and an origin of replication
(ori^E) function in E. coli. The segment marked 3' AOX1 is a piece of DNA from
the 3' end of the alcohol oxidase 1 gene of P. pastoris. A double recombination
event between the AOX1p and 3' AOX1 regions of the vector and the homologous
segments of chromosome DNA results in the insertion of the DNA carrying the
gene of interest and the HIS4 gene.

FIGURE 7.11 Integration of DNA into a specific P. pastoris chromosome site by single
(A) or double (B) recombination. (A) A single recombination (dashed line) between 
the HIS4 gene of an intact circular plasmid and a chromosome his4 mutant gene 
results in the integration of the entire vector, including the gene of interest (GOI) 
with the AOX1 promoter in the 5' AOX1 DNA segment and the transcription- 
polyadenylation sequence from the AOX1 gene (TT), into the chromosome. The 
inserted DNA is flanked by recombined mutant his4 and functional HIS4 genes. The 
dot in the his4 gene represents the mutation. (B) A double recombination (dashed 
lines) between the cloned 5' AOX1 and 3' AOX1 DNA segments of a restriction 
endonuclease (RE) linearized DNA fragment from the vector and the corresponding 
chromosome regions results in the integration of the gene of interest (GOI) with the 
AOX1 promoter in the 5' AOX1 segment, the termination-polyadenylation sequence 
from the AOX1 gene (TT), and a functional HIS4 gene. The chromosome AOX1 
coding region is lost as a result of the recombination event. 


to 48°C and can survive at higher temperatures (55°C) for several hours. At 
higher temperatures, the fungi grow in a mycelial form and revert to bud¬ 
ding cells below 42°C. Some secreted proteins, such as glucoamylase and 
invertase, are produced at higher levels in mycelia. Cell morphology also 
influences posttranslational modification, with O-linked glycosylation pre¬ 
dominating in budding cells while N glycosylation occurs in both mycelial 
and budding cells. An additional advantage of A. adeninivorans is the ability 
to grow on a wide range of inexpensive carbon and nitrogen sources. 

Stable chromosomal integration systems have been developed for A. 
adeninivorans, including a promising system based on complementation of 
a tryptophan auxotroph. An expression cassette was constructed with the 
target α-amylase gene (amyA) from the bacterium Bacillus amyloliquefaciens 
flanked by the A. adeninivorans TEF1 promoter, the S. cerevisiae PHO5 
terminator, and the selectable marker ATRP1 that restores tryptophan 
biosynthesis in an A. adeninivorans tryptophan-auxotrophic strain carrying a 
deletion in the chromosomal copy of ATRP1 (Fig. 7.12A). The 25S rDNA 
sequence was included on the vector with the expression cassette for 
targeted integration into the yeast chromosome. Following transformation of 
A. adeninivorans with the vector and selection on a medium that did not 
contain tryptophan, transformants were isolated that synthesized 
tryptophan, produced high levels of α-amylase, and carried a single copy of 
the vector with the expression cassette within a chromosomal 25S rDNA site. 
To increase the number of copies of the expression cassette integrated into 
the chromosome, a defective promoter was used to drive expression of the 
ATRP1 selectable marker (Fig. 7.12B). The integration of multiple copies of 
the cassette compensated for the low levels of ATRP1 expressed from this 


FIGURE 7.12 Constructs for stable integration of target genes into a chromosome of 
the yeast A. adeninivorans. (A) The target gene (e.g., the amyA gene) is inserted into 
a vector between the TEF1 promoter (p) and the PHO5 terminator (t), and the vector 
is introduced into a strain of A. adeninivorans that is unable to synthesize 
tryptophan. The vector is integrated into a chromosome by homologous recombination 
between chromosomal and vector 25S rDNA sequences, and expression of the 
ATRP1 gene driven by a strong promoter restores tryptophan biosynthesis, enabling 
survival of the yeast on media lacking tryptophan. (B) Expression of low levels of 
ATRP1 from a defective promoter results in chromosomal integration of multiple 
copies of the expression cassette. In this construct, the expression cassette, which 
consists of the target gene and the ATRP1 gene, is inserted in the middle of the 25S 
rDNA sequence so that, following a double recombination event, only the 
expression cassette is integrated into the A. adeninivorans genome. Sequences for 
maintenance (oriᴱ) and selection (Ampʳ) in E. coli are included on the vector. 


FIGURE 7.13 A wide-range yeast vector system for expression of heterologous genes 
in several different yeast hosts. The basic vector contains a multiple cloning site 
(MCS) for insertion of selected modules containing appropriate sequences for chro¬ 
mosomal integration (rDNA module), replication (ARS module), selection (Selection 
module), and expression (Expression module) of a target gene in a variety of yeast 
host cells (Table 7.3 shows examples of interchangeable modules). Sequences for 
maintenance (oriᴱ) and selection (Ampʳ) of the vector in E. coli are also included. 


promoter and resulted in sufficient synthesis of tryptophan to enable 
growth on medium lacking tryptophan, integration of eight copies of 
amyA, and up to fivefold-higher levels of α-amylase activity. Moreover, by 
inserting the ATRP1 and amyA genes in the middle of the 25S rDNA 
sequence on the vector, following a double recombination event, only the 
expression cassette was integrated into the host chromosome; sequences 
present on the vector that are required for initial manipulation in E. coli 
were excluded. 
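
The logic behind the defective-promoter strategy can be illustrated with a toy threshold model. This is only a sketch under stated assumptions: it supposes that marker output adds up linearly across integrated copies and that growth without tryptophan requires a fixed total output. The function name and both numeric values are hypothetical, and the per-copy value of 0.125 was chosen only so that the toy result matches the eight copies reported above.

```python
import math


def min_copies_for_growth(marker_output_per_copy, output_needed_for_growth=1.0):
    """Smallest number of integrated cassettes whose combined marker output
    reaches the growth threshold (toy model: output is additive per copy)."""
    if marker_output_per_copy <= 0:
        raise ValueError("per-copy output must be positive")
    return math.ceil(output_needed_for_growth / marker_output_per_copy)


# Hypothetical relative outputs, in arbitrary units.
strong_promoter = 1.0       # one copy already supports growth
defective_promoter = 0.125  # illustrative value only

print(min_copies_for_growth(strong_promoter))
print(min_copies_for_growth(defective_promoter))
```

Under these assumptions, selection on medium lacking tryptophan recovers single-copy integrants when the marker promoter is strong and only multicopy integrants when it is defective.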

It is often necessary to try several host types in order to find the one 
that produces the highest levels of a biologically active recombinant pro¬ 
tein. Differences in the processing and productivity of a particular protein 
can occur among different yeast strains. For example, both S. cerevisiae and 
H. polymorpha produced a truncated version of the heterologous protein 
interleukin-6 (IL-6), whereas A. adeninivorans produced a full-length ver¬ 
sion of the protein. The construction of a wide-range yeast vector for 
expression in several fungal species has facilitated this trial-and-error pro¬ 
cess (Fig. 7.13). The basic vector contains features for propagation and 
selection in E. coli and a multiple cloning site for insertion of 
interchangeable modules that are chosen for a particular yeast host, including a 
sequence for vector integration into the fungal genome, a suitable origin of 
replication, a promoter to drive expression of the heterologous gene, and 
selectable markers to complement a range of nutritional auxotrophies or to 
confer resistance to antifungal compounds, such as hygromycin B (Table 
7.3). In other words, by selecting from a range of available modules, 
customized vectors can be rapidly and easily constructed for expression of the 
same gene in several different yeast cells to determine which host is optimal 
for heterologous-protein production. 
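
The modular design can be pictured as a small catalog from which one module of each type is chosen. The sketch below is illustrative only: the module names and donors are transcribed from Table 7.3, but the function `assemble_vector`, the rule that a vector needs a selection module, an expression module, and at least one of the integration or replication modules, and the particular combination shown are assumptions made for the example, not a validated vector design.

```python
# Module catalog transcribed from Table 7.3 as (element, donor organism) pairs.
CATALOG = {
    "integration": [("25S rDNA", "A. adeninivorans"),
                    ("18S rDNA", "A. adeninivorans"),
                    ("18S rDNA", "H. polymorpha")],
    "replication": [("2um ARS", "S. cerevisiae"),
                    ("ARS1", "S. cerevisiae"),
                    ("HARS", "H. polymorpha")],
    "selection": [("URA3", "S. cerevisiae"),
                  ("LEU2", "S. cerevisiae"),
                  ("ALEU2m", "A. adeninivorans"),
                  ("ATRP1m", "A. adeninivorans"),
                  ("HIS4", "P. pastoris")],
    "expression": [("MOX promoter", "H. polymorpha"),
                   ("AOX1 promoter", "P. pastoris"),
                   ("TEF1 promoter", "A. adeninivorans"),
                   ("GAA promoter", "A. adeninivorans"),
                   ("RPS7 promoter", "Y. lipolytica")],
}
MODULE_ORDER = ("integration", "replication", "selection", "expression")


def assemble_vector(target_gene, choices):
    """Return an ordered description of a customized vector.

    `choices` maps a module type to an (element, donor) pair from CATALOG.
    """
    for kind, module in choices.items():
        if module not in CATALOG.get(kind, []):
            raise ValueError(f"unknown {kind} module: {module}")
    # Assumed rule for this sketch: selection and expression are mandatory.
    if "selection" not in choices or "expression" not in choices:
        raise ValueError("selection and expression modules are required")
    # Assumed rule: the vector must either integrate or replicate in the host.
    if "integration" not in choices and "replication" not in choices:
        raise ValueError("choose an integration or a replication module")
    parts = ["E. coli backbone (ori, Amp resistance)"]
    for kind in MODULE_ORDER:
        if kind in choices:
            element, donor = choices[kind]
            label = f"{kind}: {element} ({donor})"
            if kind == "expression":
                label += f" driving {target_gene}"
            parts.append(label)
    return parts


# Hypothetical combination for an A. adeninivorans host.
vector = assemble_vector("amyA", {
    "integration": ("25S rDNA", "A. adeninivorans"),
    "selection": ("ATRP1m", "A. adeninivorans"),
    "expression": ("TEF1 promoter", "A. adeninivorans"),
})
for part in vector:
    print(part)
```

Swapping a single entry in `choices` yields a vector for a different host while the target gene and the E. coli backbone stay the same, which is the point of the wide-range system.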

Filamentous Fungal Systems 

Distinct from unicellular yeasts, filamentous fungi are multicellular, micro¬ 
scopic fungi that produce long, branching strands of cells called hyphae. 
This group of fungi includes the common mold genera Penicillium, Rhizopus, 
Trichoderma, and Aspergillus. Many species of these genera of filamentous 
fungi are a rich natural resource for commercially important metabolites 
and enzymes, such as the antibiotic penicillin, the cancer chemotherapeutic 
agent taxol, and the cellulose-degrading enzyme cellulase, and have also 
been used as cell factories for the production of recombinant proteins for the 
food, beverage, pulp and paper, and pharmaceutical industries (Table 7.4). 
Similar to yeast, filamentous fungi can grow rapidly on inexpensive media, 
secrete large amounts of proteins, process eukaryotic mRNA, and carry out 
many posttranslational modifications. However, an additional advantage of 
using filamentous fungi as hosts for the production of mammalian proteins 
is their ability to add mammalian-like sugars to proteins, in contrast to 
yeasts, which typically add sugars with high mannose content. 

Several vectors have been constructed with appropriate transcription 
and translation control elements for the expression of recombinant proteins 
in filamentous fungi, and some of these are commercially available. To 
achieve high yields, multiple copies of the target gene are expressed under 
the control of a strong promoter. Commonly used promoters include the 
regulatable promoter from the cellobiohydrolase I gene (cbh1) from T. reesei 
or the glucoamylase A gene (glaA) from Aspergillus niger, or the strong con¬ 
stitutive promoter from the glyceraldehyde-3-phosphate dehydrogenase 
gene (gpdA) from Aspergillus nidulans. However, yields of recombinant pro¬ 
teins are low, owing in part to degradation by extracellular proteases. The 
use of protease-deficient strains and gene fusion constructs has improved 
the output of recombinant proteins to a limited extent. The fusion partner 
is often a secreted protein that can carry the target protein with it to the 
secretory apparatus. It may also protect the target protein from degrada¬ 
tion. When the human IL-6 gene was initially expressed in A. nidulans from 
the A. niger glaA promoter, protein production was very low; however, 
when it was expressed from the same promoter as a fusion with the 
secreted glucoamylase A protein, IL-6 yields increased 200-fold, although 
the yields were still too low for commercial viability, possibly due to deg¬ 
radation by host proteases (Table 7.5). Expression of IL-6 from the strong, 
constitutive gpdA promoter in a protease-deficient host improved the pro¬ 
tein yields, but only in a mutant host that was unable to acidify the culture 
medium. The increased yields of recombinant protein at higher pH were 
attributed to reduced protease activity and altered fungal morphology. At 
higher pH, the fungus formed small mycelial pellets that contained more 
live cells. 

This system has been used successfully to produce another human 
protein, the α₁-proteinase inhibitor, which blocks the activity of neutrophil 
elastase in the lungs. Individuals deficient in α₁-proteinase inhibitor can 
develop fatal emphysema because they are unable to prevent elastase 


TABLE 7.3 Examples of modules available for wide-range yeast vector systems 

Module and gene       Donor 
Integration 
  25S rDNA            A. adeninivorans 
  18S rDNA            A. adeninivorans 
  18S rDNA            H. polymorpha 
Replication 
  2μm ARS             S. cerevisiae 
  ARS1                S. cerevisiae 
  HARS                H. polymorpha 
Selection 
  URA3                S. cerevisiae 
  LEU2                S. cerevisiae 
  ALEU2m              A. adeninivorans 
  ATRP1m              A. adeninivorans 
  HIS4                P. pastoris 
Expression 
  MOX promoter        H. polymorpha 
  AOX1 promoter       P. pastoris 
  TEF1 promoter       A. adeninivorans 
  GAA promoter        A. adeninivorans 
  RPS7 promoter       Y. lipolytica 

Adapted from Gellissen et al., FEMS Yeast Res. 5:1079-1096, 2005. 
HARS, Hansenula ARS. 
TABLE 7.4 Some recombinant proteins produced by filamentous fungal expression systems 

Recombinant protein     Host cell                     Main application 
α-Amylase               A. niger, Aspergillus oryzae  Starch processing, food industry 
Aspartyl protease       A. nidulans, A. oryzae        Food industry 
Cellulase               T. reesei                     Textile, pulp and paper industries 
Chymosin                A. niger                      Food industry 
Immunoglobulin G        A. niger                      Pharmaceutical industry 
Insulin                 A. niger                      Pharmaceutical industry 
Interleukin-6           A. niger                      Pharmaceutical industry 
Laccase                 A. niger, T. reesei           Textile, pulp and paper industries 
Manganese peroxidase    A. niger                      Chemical industry 
Lactoferrin             A. oryzae                     Pharmaceutical industry 
Lipase, thermophilic    A. oryzae                     Detergent 
Lysozyme                A. niger                      Pharmaceutical industry 
Phytase                 T. reesei                     Food industry 
Xylanase                A. niger, T. reesei           Textile, pulp and paper, food industries 


from damaging lung tissue. Purified α₁-proteinase inhibitor is used for 
replacement therapy; however, it is difficult to obtain sufficient quantities 
from blood plasma, where it is normally found, and there is a risk of 
transferring infectious agents with the blood product, making production in 
fungal hosts an attractive alternative. A 1,230-base-pair cDNA encoding 
the mature human α₁-proteinase inhibitor was cloned into an expression 
vector under the control of the gpdA promoter and fused to the 
glucoamylase A gene to facilitate the secretion of α₁-proteinase inhibitor (Fig. 
7.14). The nucleotide sequence encoding the recognition site for a specific 
host endoprotease was included between the α₁-proteinase inhibitor gene 
and the fusion partner for in vivo cleavage of the fusion protein and 
release of α₁-proteinase inhibitor. Expression of this construct in 
protease-deficient, nonacidifying A. niger resulted in the secretion of active, 
stable, glycosylated α₁-proteinase inhibitor to the culture medium, demonstrating 
the efficacy of using filamentous fungal hosts to produce human proteins 
of therapeutic value. 

Transformation of filamentous fungi may be achieved by a variety of 
methods, including (1) using protoplasts that have had their cell walls 
removed to facilitate DNA uptake, (2) Agrobacterium-mediated transfer of a 
vector carrying the target gene in a manner similar to that used to transform 
plants, (3) electroporation, and (4) biolistic transformation, which "shoots" 
the target DNA into the cell on a gold or tungsten particle. Not all of these 


TABLE 7.5 Production of human interleukin-6 in filamentous fungi 

Host cell     Relevant host trait                 Promoter (donor)   Fusion partner   Yield (mg/liter) 
A. nidulans   -                                   glaA (A. niger)    -                <0.1 
A. nidulans   -                                   glaA (A. niger)    glaA             5 
A. niger      -                                   glaA               -                <0.1 
A. niger      Protease deficient                  gpdA               -                <0.1 
A. niger      Protease deficient                  gpdA               glaA             2 
A. niger      Protease deficient, nonacidifying   gpdA               glaA             10 

Adapted from Punt et al., Trends Biotechnol. 20:200-206, 2002. 
methods have been successful for all filamentous fungi, and the best strategy 
to transform a given species must be determined experimentally. 

While targeted integration of a foreign gene into the genome of the 
yeast S. cerevisiae via recombination between homologous nucleotide 
sequences on the vector and in the host genome usually occurs at a high 
frequency, similar approaches in filamentous fungi have been less suc¬ 
cessful. An alternative approach exploits another recombination system 
found in many eukaryotes, including filamentous fungi, known as the 
nonhomologous-end-joining pathway. In this pathway, the Ku protein 
binds to the ends of introduced linear DNA and, with the DNA ligase 
IV-Xrcc4, integrates the fragment carrying the foreign gene at random sites 
in the fungal host chromosome. This system does not require homology 
between the introduced DNA and the integration site. Interestingly, when 
the nonhomologous-end-joining pathway is disrupted through mutations 
in Ku or the DNA ligase, targeted integration through homologous recom¬ 
bination is improved. 

In sum, fungal expression systems play an important role in the pro¬ 
duction of heterologous proteins for research, industrial, and medical 
applications. However, experience has shown that no one system is able to 
produce an authentic version of every heterologous protein. For this and 
other reasons, gene expression systems that use insect or mammalian cells 
have been developed. 

Baculovirus-Insect Cell Expression Systems 

Baculoviruses are a large, diverse group of viruses that specifically infect 
arthropods, including many insect species, and are not infectious to other 
animals. During the infection cycle, two forms of baculovirus are produced 
(Fig. 7.15). The infection is initiated by the occluded form of the virus. In 
this form, the viral nucleocapsids (virions) are clustered in a matrix that is 
made up of the protein polyhedrin. The occluded virions packaged in this 
protein matrix are referred to as a polyhedron and are protected from inac¬ 
tivation by environmental agents. Once the virus is taken up into the 
midgut of the insect, usually through ingestion of contaminated plant 
material, the polyhedrin matrix dissolves due to the alkaline gut environ¬ 
ment, and the virions enter midgut cells to begin the infection cycle in the 
nucleus. Within the insect midgut, the infection can spread from cell to cell 
as viral particles (single nucleocapsids) bud off from an infected cell and 
infect other midgut cells. This form of the virus, known as the budding 
form, is not embedded in a polyhedrin matrix and is not infectious to other 
individual insect hosts, although it can infect cultured insect cells. Plaques 
produced in insect cell cultures by the budding form of baculovirus have a 


FIGURE 7.14 Construct for expression and secretion of the human α₁-proteinase 
inhibitor in the filamentous fungus A. niger. The expression cassette includes the 
strong constitutive promoter gpdAp, the transcriptional terminator from the TrpC 
gene (TrpCt), the cDNA encoding glucoamylase to facilitate secretion, and the 
coding sequence for the Kex2 recognition site for in vivo removal of the 
glucoamylase fusion protein by the host Kex2 endoprotease. Order of elements in 
the construct: gpdAp, glucoamylase cDNA, KEX2 cleavage site, human 
α₁-proteinase inhibitor cDNA, TrpCt. 
different morphology from those produced by the occluded form. During the 
late stages of the infection cycle in the insect host, about 36 to 48 hours 
after infection, the polyhedrin protein is produced in massive quantities, and 
production continues for 4 to 5 days, until the infected cells rupture and the 
host organism dies. Occluded virions are released and can infect new hosts. 

The promoter for the polyhedrin (polyh) gene is exceptionally strong, 
and transcription from this promoter can account for as much as 25% of the 
mRNA produced in cells infected with the virus. Moreover, the polyhedrin 
protein is not required for virus production. Consequently, it was reasoned 
that replacement of the polyhedrin gene with a coding sequence for a het¬ 
erologous protein, followed by infection of cultured insect cells, would 
result in the production of large amounts of the heterologous protein. 
Furthermore, because of the similarity of posttranslational modification 
systems between insects and mammals, it was thought that the recombi¬ 
nant protein would mimic closely, if not precisely, the authentic form of the 
original protein. On the basis of these premises, a baculovirus-insect cell 
expression vector system was devised. Baculoviruses have been highly suc¬ 
cessful as delivery systems for introducing target genes for production of 
heterologous proteins in insect cells. More than a thousand different pro¬ 
teins have been produced using this system, including enzymes, transport 
proteins, receptors, and secreted proteins, for a variety of applications 
(Table 7.6). 


FIGURE 7.15 Budded (A) and occluded (B) forms of AcMNPV. During budding, a 
nucleocapsid becomes enveloped by the membrane of an infected cell. A polyhe¬ 
dron consists of clusters of nucleocapsids (occluded virions) embedded in various 
orientations in a polyhedrin matrix. 
The specific baculovirus that has been used extensively as an expression 
vector is Autographa californica multiple nuclear polyhedrosis virus 
(AcMNPV). A. californica (the alfalfa looper) and over 30 other insect spe¬ 
cies are infected by AcMNPV. This virus also grows well on many insect 
cell lines. The most commonly used cell line for genetically engineered 
AcMNPV is derived from the fall armyworm, Spodoptera frugiperda. In these 
cells, the polyhedrin promoter is exceptionally active, and during infec¬ 
tions with wild-type baculovirus, high levels of polyhedrin are synthe¬ 
sized. To a lesser extent, the Bombyx mori nuclear polyhedrosis virus has 
been used to express heterologous proteins in silkworm larvae. 

Baculovirus Expression Vectors 

The first step in the production of a recombinant AcMNPV that will be 
used to deliver the gene of interest into the insect host cell is to create a 
transfer vector. The transfer vector is an E. coli-based plasmid that carries a 
segment of DNA from AcMNPV (Fig. 7.16) consisting of the polyhedrin 
promoter region and an adjacent portion of upstream AcMNPV DNA, a 
multiple cloning site, the polyhedrin termination and polyadenylation 
signal regions, and an adjacent portion of downstream AcMNPV DNA. 
The coding region for the polyhedrin gene has been deleted from this block 
of DNA. The upstream and downstream AcMNPV DNA segments included 
on the transfer vector provide regions for homologous recombination with 
AcMNPV. A gene of interest is inserted into the multiple cloning site 
between the polyhedrin promoter and termination sequences, and the 
transfer vector is propagated in E. coli. 

Next, insect cells in culture are cotransfected with AcMNPV DNA and 
the transfer vector carrying the cloned gene. Within some of the doubly 
transfected cells, a double-crossover recombination event occurs at homol¬ 
ogous polyhedrin gene sequences on the transfer vector and in the 
AcMNPV genome, and the cloned gene with polyhedrin promoter and 
termination regions becomes integrated into the AcMNPV DNA (Fig. 7.17) 
with the concomitant loss of the polyhedrin gene. Virions lacking the poly¬ 
hedrin gene produce distinctive zones of cell lysis (occlusion-negative 
plaques), from which recombinant baculovirus is isolated. 
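
The outcome of the double crossover can be sketched as a string operation: whatever lies between the two homologous arms in the viral genome is replaced by whatever lies between the same arms on the transfer vector. The sequences, arm names, and function below are toy placeholders invented for illustration; real homology arms are far longer, and recombination in cells is a rare event rather than a deterministic substitution.

```python
def double_crossover(genome, vector, arm5, arm3):
    """Replace the genome segment between arm5 and arm3 with the segment
    that lies between the same two arms on the transfer vector.

    Returns the recombinant sequence, or None if either molecule lacks
    both arms in the expected 5'-to-3' order.
    """
    g5 = genome.find(arm5)
    v5 = vector.find(arm5)
    if g5 < 0 or v5 < 0:
        return None
    g3 = genome.find(arm3, g5 + len(arm5))
    v3 = vector.find(arm3, v5 + len(arm5))
    if g3 < 0 or v3 < 0:
        return None
    insert = vector[v5 + len(arm5):v3]  # promoter, cloned gene, terminator
    return genome[:g5 + len(arm5)] + insert + genome[g3:]


# Toy sequences; every value here is a made-up placeholder.
ARM5 = "ACGTAC"  # stands in for upstream AcMNPV DNA
ARM3 = "TTGCAA"  # stands in for downstream AcMNPV DNA
wild_type = "CC" + ARM5 + "GGGGGGGG" + ARM3 + "AA"     # G run = polyhedrin gene
transfer_vector = "TT" + ARM5 + "ATGAAATAG" + ARM3 + "TT"

recombinant = double_crossover(wild_type, transfer_vector, ARM5, ARM3)
print(recombinant)
```

The plasmid backbone outside the arms never enters the viral genome, and the polyhedrin placeholder is lost, which corresponds to the occlusion-negative phenotype used for screening.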

DNA hybridization or a polymerase chain reaction (PCR) assay can be 
used to confirm the presence of recombinant baculovirus. To facilitate the 


TABLE 7.6 Some of the recombinant proteins that have been produced by the baculovirus expression vector system 

α-Interferon 
Adenosine deaminase 
Anthrax antigen 
β-Amyloid precursor protein 
β-Interferon 
Bovine rhodopsin 
Cystic fibrosis transmembrane conductance regulator 
Dengue virus type 1 antigen 
Erythropoietin 
G-protein-coupled receptors 
HIV-1 envelope protein 
HSV capsid proteins 
Human alkaline phosphatase 
Human DNA polymerase α 
Human pancreatic lipase 
Influenza virus hemagglutinin 
Interleukin-2 
Lassa virus protein 
Malaria proteins 
Mouse monoclonal antibodies 
Multidrug transporter protein 
Poliovirus proteins 
Pseudorabies virus glycoprotein 50 
Rabies virus glycoprotein 
Respiratory syncytial virus antigen 
Simian rotavirus capsid antigen 
Tissue plasminogen activator 

HIV-1, human immunodeficiency virus type 1; HSV, herpes simplex virus. 
FIGURE 7.16 Organization of the expression unit of a baculovirus (AcMNPV) transfer 
vector. The gene of interest is inserted into the multiple cloning site (MCS) that lies 
between the polyhedrin gene promoter (Pp) and polyhedrin gene transcription 
termination (Pt) sequences. The AcMNPV DNA upstream from the polyhedrin 
promoter (5' AcMNPV DNA) and downstream from the polyhedrin transcription 
termination sequence (3' AcMNPV DNA) provides sequences for integration of the 
expression unit by homologous recombination into an AcMNPV genome. 


detection of recombinant plaques, the E. coli lacZ gene, which encodes 
β-galactosidase, is put under the control of a baculovirus promoter that is 
turned on during the early to late stages of the infection cycle, and this 
construct is made part of the DNA unit that is incorporated into the 
AcMNPV genome. Recombinant plaques turn blue when a chromogenic 
substrate for β-galactosidase is added to the medium. Additional rounds of 
infection of insect cells with viruses from occlusion-negative plaques 
increase the concentration of recombinant viruses. Heterologous protein is 
harvested 4 to 5 days after host insect cells are infected with a high-titer 
recombinant baculovirus stock. 

Increasing the Yield of Recombinant Baculovirus 

The identification of occlusion-negative plaques is subjective, and purifica¬ 
tion of recombinant baculovirus is tedious due to the low frequency of 
recombination (~0.1%) between the AcMNPV DNA and the transfer 
plasmid; therefore, improvements have been made to the original proce¬ 
dure to increase the frequency of recombinant viruses. A very simple yet 
effective procedure that increases the frequency of recombinant plaques to 
about 30% involves linearization of the AcMNPV genome before 
transfection into insect cells. Linearized baculovirus genomes have reduced 
infectivity, decreasing the number of nonrecombinant plaques, and a 
double-crossover event between a linearized AcMNPV DNA molecule and 
a circular transfer vector establishes a closed circular DNA molecule that is 
infective. 

The linearization method was further developed to generate a system 
that produces an even higher frequency of recombinant baculovirus 
plaques. The AcMNPV genome was engineered with two Bsu36I sites that 
were placed on either side of the polyhedrin gene (Fig. 7.18). One is in gene 
603 (open reading frame 603 [ORF603]), and the other is in a gene (ORF1629) 
that is essential for viral replication. When DNA from this modified bacu¬ 
lovirus is treated with Bsu36I and transfected into insect cells, no viral 
replication occurs because a segment of the essential gene (ORF1629) is 
missing. As part of this system, a transfer vector is constructed with the 
gene of interest between intact versions of gene 603 and gene 1629. This 
transfer vector is introduced into insect cells that were previously trans¬ 
fected with linearized, replication-defective AcMNPV DNA that is missing 
the segment between the two Bsu36I sites. A double-crossover event both 
reestablishes a functional version of ORF1629 and incorporates the cloned 


FIGURE 7.17 Replacement of the AcMNPV polyhedrin gene with an expression unit 
from a transfer vector. A double crossover event (x) between homologous DNA 
segments of the transfer vector and the AcMNPV genome results in the integration 
of the expression unit into the AcMNPV genome. GOI, gene of interest; Pp, polyhe¬ 
drin promoter; Pt, polyhedrin polyadenylation and termination sequence. 


gene into the AcMNPV genome (Fig. 7.18). With this system, over 90% of 
the baculovirus plaques are recombinant. 
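
The practical effect of these improvements can be made concrete with a short calculation of how many plaques must be examined to have a 95% chance of finding at least one recombinant. This is a back-of-the-envelope sketch that assumes each plaque is an independent draw with the stated recombinant frequency; the function name and the 95% target are choices made for the example, and 0.90 is used as a lower bound for "over 90%".

```python
import math


def plaques_to_screen(recombinant_fraction, confidence=0.95):
    """Smallest n such that 1 - (1 - f)**n >= confidence."""
    if not 0 < recombinant_fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - recombinant_fraction))


for label, fraction in [
    ("circular AcMNPV DNA (~0.1%)", 0.001),
    ("linearized AcMNPV DNA (~30%)", 0.30),
    ("Bsu36I-deleted AcMNPV DNA (>90%)", 0.90),
]:
    print(f"{label}: screen at least {plaques_to_screen(fraction)} plaques")
```

Under these assumptions, the screening burden falls from roughly 3,000 plaques with the original procedure to 9 after linearization and to 2 with the Bsu36I system.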

Integration of Target Genes into Baculovirus by Site-Specific 
Recombination 

To eliminate the need to use plaque assays to identify and purify recombi¬ 
nant viruses, several methods have been developed that introduce the 
target gene into the baculovirus genome at a specific nucleotide sequence 
by recombination, either in an intermediate bacterial host, such as E. coli, or 
in vitro outside of a host cell. Transfection of insect cells is required only for 
the production of the heterologous protein. AcMNPV DNA can be 
maintained in E. coli as a plasmid known as a bacmid, which is a 
baculovirus-plasmid hybrid molecule. In addition to AcMNPV genes, the bacmid 
contains an origin of replication for propagation in E. coli, a kanamycin 
resistance gene for selection of the bacmid, and an integration site (attach¬ 
ment site) that is inserted into the lacZ' gene without impairing its function 
(Fig. 7.19A). Another component of this system is the transfer vector that 
carries the gene of interest cloned between the polyhedrin promoter and a 
terminator sequence. In the transfer vector, the target gene expression unit 
(expression cassette) and a gentamicin resistance gene are flanked by DNA 
FIGURE 7.18 Production of recombinant baculovirus. Single Bsu36I sites are 
engineered into gene 603 and a gene (1629) that is essential for AcMNPV replication. 
These genes flank the polyhedrin gene in the AcMNPV genome. After a baculovirus 
with two engineered Bsu36I sites is treated with Bsu36I, the segment between the 
Bsu36I sites is deleted. Insect cells are cotransfected with a Bsu36I-treated 
baculovirus DNA and a transfer vector with a gene of interest under the control of the 
promoter (p) and terminator (t) elements of the polyhedrin gene and the complete 
sequences of both genes 603 and 1629. A double crossover event (dashed lines) 
generates a recombinant baculovirus with a functional gene 1629. With this system, 
almost all of the progeny baculoviruses are recombinant. 


attachment sequences that can bind to the attachment site in the bacmid 
(Fig. 7.19B). An ampicillin resistance gene lies outside the expression cas¬ 
sette for selection of the transfer vector. 

Bacterial cells carrying a bacmid are cotransformed with the transfer 
vector and a helper plasmid that carries a tetracycline resistance gene and 
encodes the specific proteins (transposition proteins) that mediate 
recombination between the attachment sites on the transfer vector and on the 
bacmid. After recombination, the DNA segment that is bounded by the 
two attachment sites on the transfer vector (the expression cassette carrying 
the target gene) is transposed into the attachment site on the bacmid, 
destroying the reading frame of the lacZ' gene (Fig. 7.19C). Consequently, 
bacteria with recombinant bacmids produce white colonies in the presence 
of isopropyl-β-D-thiogalactopyranoside (IPTG) and 
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal). Moreover, white colonies 
that are resistant to kanamycin and gentamicin and sensitive to both ampicillin 
and tetracycline carry only a recombinant bacmid and neither the transfer 
vector nor the helper plasmid. After all of these manipulations, the integrity 
of the cloned gene can be confirmed by PCR. Finally, recombinant bacmid DNA 
can be transfected into insect cells, where the cloned gene is transcribed and 
the heterologous protein is produced. 
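
The screening rules in this paragraph amount to a small decision table, sketched below. The function name, the category labels, and the handling of ambiguous phenotypes are assumptions made for illustration; only the marker-to-molecule assignments (kanamycin for the bacmid, gentamicin for the transposed cassette, ampicillin for the transfer vector, tetracycline for the helper plasmid) come from the text.

```python
def classify_colony(color, resistant_to):
    """Interpret an E. coli colony grown with IPTG and X-Gal.

    color        -- "white" or "blue"
    resistant_to -- iterable of antibiotics on which the colony grows
    """
    resistant_to = set(resistant_to)
    if "kanamycin" not in resistant_to:
        return "no bacmid"
    # Blue colonies still have an intact lacZ' reading frame; a recombinant
    # must also have gained the gentamicin marker from the cassette.
    if color != "white" or "gentamicin" not in resistant_to:
        return "not a confirmed recombinant"
    leftovers = resistant_to & {"ampicillin", "tetracycline"}
    if leftovers:
        return "recombinant bacmid plus residual plasmid"
    return "recombinant bacmid only"


print(classify_colony("white", ["kanamycin", "gentamicin"]))
print(classify_colony("white", ["kanamycin", "gentamicin", "ampicillin"]))
print(classify_colony("blue", ["kanamycin"]))
```

Only colonies in the last category, "recombinant bacmid only", would be carried forward to PCR confirmation and transfection of insect cells.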

Another method exploits the bacteriophage λ integration system for 
site-specific recombination of λ DNA into a host bacterial genome. All 
genetic manipulations to construct the baculovirus expression vector are 
carried out in vitro before transferring the recombinant virus into the insect 
host for selection and production of the heterologous protein. First, the 
AcMNPV genome was engineered to include two different integration 
sequences (attachment sites attR1 and attR2) derived from bacteriophage λ 
that flank the tk gene encoding the enzyme thymidine kinase from herpes 
simplex virus (Fig. 7.20A). Thymidine kinase converts the synthetic 
nucleotide analogue ganciclovir into a product that is toxic to insect cells and 
was included in the engineered AcMNPV construct for subsequent 
selection against nonrecombinant viruses. The attR1-tk-attR2 DNA segment 
(cassette) was placed downstream of the polyhedrin gene promoter. Next, 
the target gene was inserted into a donor plasmid between two attachment 
sites, attL1 and attL2, that correspond to the attachment sites in the 
engineered AcMNPV genome (Fig. 7.20B). Integration of the target gene into 
the AcMNPV genome occurs by recombination between the attachment 
sites flanking the target gene on the plasmid and flanking the tk gene in the 
AcMNPV genome. Site-specific recombination between the attachment 
sequences is mediated by the addition of integration enzymes (integrase, 
excisionase, and integration host factor) derived from bacteriophage λ to 
the in vitro reaction mixture (Fig. 7.20C). Insect cells are then transfected 
with the recombinant baculovirus for selection and propagation. Because 
recombination results in excision of the thymidine kinase gene from the 
baculovirus genome, only cells transfected with viruses carrying the target 
gene will survive in the presence of ganciclovir. 
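
The cassette exchange and the ganciclovir counterselection can be sketched by treating each DNA molecule as an ordered list of named elements. This is a simplified illustration: the element labels, the function names, and the generic labels "att1" and "att2" given to the recombined junctions are placeholders introduced here, and the sketch ignores reaction efficiency and by-products.

```python
def exchange_cassette(viral_genome, donor_plasmid):
    """Swap the segment between attR1 and attR2 in the viral genome for the
    segment between attL1 and attL2 on the donor plasmid."""
    r1, r2 = viral_genome.index("attR1"), viral_genome.index("attR2")
    l1, l2 = donor_plasmid.index("attL1"), donor_plasmid.index("attL2")
    insert = donor_plasmid[l1 + 1:l2]
    # The junctions left after recombination get generic placeholder labels.
    return viral_genome[:r1] + ["att1"] + insert + ["att2"] + viral_genome[r2 + 1:]


def survives_ganciclovir(viral_genome):
    """Cells carrying a virus that still encodes thymidine kinase are killed."""
    return "tk" not in viral_genome


engineered_acmnpv = ["polyhedrin promoter", "attR1", "tk", "attR2"]
donor = ["attL1", "target gene", "attL2"]

recombinant = exchange_cassette(engineered_acmnpv, donor)
print(recombinant)
print(survives_ganciclovir(engineered_acmnpv), survives_ganciclovir(recombinant))
```

Because the tk gene leaves the genome in the same event that brings in the target gene, selection with ganciclovir enriches for recombinants without a plaque screen.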


Mammalian Glycosylation and Processing of Precursor Proteins in 
Insect Cells 

Although insect cells can process proteins in a manner similar to that of 
higher eukaryotes, some mammalian proteins produced in S. frugiperda cell 
lines are not authentically glycosylated. For example, insect cells do not 
normally add galactose and terminal sialic acid residues to N-linked glyco¬ 
proteins. Where these residues are normally added to mannose residues 
during the processing of some proteins in mammalian cells, insect cells will 
FIGURE 7.19 Construction of a recombinant bacmid. (A) An E. coli plasmid is 
incorporated into the AcMNPV genome by a double crossover event (dashed lines) 
between DNA segments (5' and 3') that flank the polyhedrin gene to create a shuttle 
vector (bacmid) that replicates in both E. coli and insect cells. The gene for resistance 
to kanamycin (Kanʳ), an attachment site (att) that is inserted in frame in the lacZ' 
sequence, and an E. coli origin of replication (oriᴱ) are introduced as part of the 
plasmid DNA. (B) The transposition proteins encoded by genes of the helper 
plasmid facilitate the integration (transposition) of the DNA segment of the transfer 
vector that is bounded by two attachment sequences (attR and attL). The gene for 
resistance to gentamicin (Genʳ) and a gene of interest (GOI) that is under the control 
of the promoter (p) and transcription terminator (t) elements of the polyhedrin gene 
are inserted into the attachment site (att) of the bacmid. The helper plasmid and 
transfer vector carry the genes for resistance to tetracycline (Tetʳ) and ampicillin 
(Ampʳ), respectively. (C) The recombinant bacmid has a disrupted lacZ' gene (*). 
The right-angled arrow denotes the site of initiation of transcription of the cloned 
gene after transfection of the recombinant bacmid into an insect cell. Cells that are 
transfected with a recombinant bacmid are not able to produce functional 
β-galactosidase. 
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trim the oligosaccharide to produce paucimarmose (Fig. 7.21). Consequently, 
because these residues are usually important for the proper functioning of 
a protein and improperly glycosylated mammalian proteins may elicit an 
allergic response when used as human therapeutic agents, the baculovirus 
system cannot be used for the production of several important mammalian 
glycoproteins. Host insect cell lines have been constructed with an
integrated mammalian β-1,4-galactosyltransferase gene and a mammalian
α-2,6-sialyltransferase gene under the control of a promoter that is active during
the early stages of the baculovirus infection cycle. Under test conditions, 
this system synthesized N-linked glycans with both galactose and sialic 
acid; however, the position of the sugar residues in the oligosaccharide 
chain was not identical to that found on the natural human glycoprotein. 
To ensure the production of "humanized" glycoproteins with accurate
glycosylation patterns, an insect cell line was constructed to express five
different mammalian glycosyltransferases (Fig. 7.21). Because insect cells do
not normally produce cytidine monophosphate (CMP)-sialic acid, the
substrate for sialyltransferase and donor molecule for sialic acid residues, cell
lines have also been constructed to express sialic acid synthases that
produce CMP-sialic acid from N-acetylmannosamine provided in the culture
medium. Although cell lines have been improved by the addition of
mammalian N-glycan-processing capabilities, in some cases it may be preferable
to express the protein of interest in a mammalian host.


FIGURE 7.20 Construction of a recombinant baculovirus expression vector in vitro.
(A) The AcMNPV genome was engineered to carry the thymidine kinase gene (tk)
between attachment sequences attR1 and attR2 and downstream of the baculovirus
polyhedrin gene promoter (Pp). (B) The target gene is cloned into a donor plasmid
between attachment sequences attL1 and attL2. (C) The AcMNPV DNA carrying
the tk gene and the donor plasmid are combined in a test tube, and integration
enzymes are added to catalyze the recombination between corresponding attR and
attL sites. Recombination results in the replacement of the tk gene with the target
gene in the AcMNPV DNA so that, following transfection with the recombinant
AcMNPV DNA, the insect cells are resistant to the toxic effects of ganciclovir and
express the target gene from the polyhedrin gene promoter.

A further improvement that prevents undesirable processing of heterologous
proteins in insect cells using the baculovirus expression system is
the removal of the genes encoding chitinase and the protease v-cathepsin
from the AcMNPV genome. v-Cathepsin is normally produced late in the
infection cycle to facilitate the release of new virions from the insect host. It 
also reduces the yield of heterologous proteins through proteolytic 
cleavage. Chitinase is produced in conjunction with v-cathepsin and is 
thought to function in the proper folding of v-cathepsin and in the
degradation of the host exoskeleton. It is secreted at very high levels from
baculovirus-infected insect host cells and can compete with secreted target
proteins for the secretory apparatus, thereby reducing yields of the target 
protein. Coexpression of chaperones to ensure proper folding of the target 
protein has also resulted in increased yields of functional heterologous 
proteins. 

Production of Multiprotein Complexes Using Baculovirus 

In many cases, the baculovirus-insect expression system is used to express 
a single protein of interest. For example, it has been recently tested for the 
production of hemagglutinin, the dominant antigenic protein on the
surface of the influenza virus that has potential to be used as a vaccine against
influenza infection (subunit vaccines are discussed in chapter 12). In mice 
and humans, the purified recombinant hemagglutinin vaccine produced 
using baculovirus in insect cells was found to be safe and to provide a high 
level of protection against influenza virus infection. Influenza vaccines in 
current use are inactivated or attenuated forms of the whole virus;
however, protection against influenza virus using these vaccines is short-lived


FIGURE 7.21 N glycosylation of proteins in the Golgi apparatus of insect, human, and 
"humanized" insect cells. While the sugar residues added to N-glycoproteins in the 
endoplasmic reticulum are similar in insect and human cells, further processing in 
the Golgi apparatus yields a trimmed oligosaccharide (paucimannose) in insect 
cells and an oligosaccharide that terminates in sialic acid in human cells. To
produce recombinant proteins for use as human therapeutic agents, "humanized"
insect cells have been engineered to express several enzymes that process human 
glycoproteins accurately. Blue squares, N-acetylglucosamine; red circles, mannose; 
green squares, galactose; orange squares, sialic acid. 







because the antigenic surface proteins, such as hemagglutinin, change
rapidly, giving rise to new influenza virus strains against which existing
vaccines are not effective. An important advantage of producing a recombinant
subunit vaccine using baculovirus is the relative ease with which genes 
such as the hemagglutinin gene can be cloned and expressed. This allows 
the rapid and flexible production of vaccines that change on a frequent 
basis. Currently, whole-virus vaccines must be propagated in chicken eggs 
and therefore have long production times and are less amenable to change 
once production has begun. Moreover, because production using eggs is 
avoided, baculovirus vaccines do not contain egg proteins that can
stimulate an allergic response in some individuals.

The simultaneous expression of two or more cloned genes can lead to 
the formation of functional multimeric protein complexes. This can be 
accomplished by cotransfecting insect cells with multiple baculoviruses, 
each engineered to express one target protein, or, using a more manageable 
approach, by transfection with a single recombinant virus expressing
multiple proteins. AcMNPV is particularly amenable to carrying large
insertions, up to 38 kb, with several foreign genes due to its flexible envelope.
An important example is the production of vaccines known as virus-like
particles using AcMNPV. Virus-like particles are composed of the
assembled protein coat of the virus but without the nucleic acid genome (Fig.
7.22A), and while these particles are often infectious (i.e., they can enter 
cells), they cannot replicate and therefore do not cause disease. One of the 
shortcomings of using single antigenic proteins as vaccines is that they 
often have poor immunogenicity, that is, they do not elicit a strong immune 
response. Researchers have shown that protein vaccines that more closely 
mimic the overall structure of a virus particle, such as virus-like particles, 
evoke a stronger response and therefore provide greater protection against 
subsequent infection. Generally, multiple genes encoding the proteins that 
make up the virus-like particle are cloned into a transfer vector, and these 
are then incorporated into the AcMNPV DNA via site-specific
recombination either in vitro, as described above, or in E. coli. In the latter case, rather
than supplying purified enzymes required for recombination in vitro, the
recombinase is produced from a gene in the E. coli genome. In one study,
the genes for three different envelope structural proteins from the human
severe acute respiratory syndrome coronavirus (SARS-CoV) were expressed
simultaneously at a high level from a single baculovirus vector (Fig. 7.22B).
The proteins were found to assemble spontaneously and stably into
virus-like particles. Although the envelope proteins were also expressed from
three separate baculovirus vectors in a single host cell, stable virus-like 
particles were not recovered, possibly due to asynchronous expression of 
sufficient structural components. 


Mammalian Cell Expression Systems 

Mammalian cell expression systems are important for the production of 
heterologous proteins with a full complement of posttranslational
modifications. Currently, about half of the commercially available therapeutic
proteins are produced in mammalian cells. However, there are several 
major challenges to the production of high levels of heterologous proteins 
in mammalian cell lines. Generally, these cells are slow growing, have more 
fastidious growth requirements, and can become contaminated with 
animal viruses. A number of established mammalian cell lines have been
developed as hosts for heterologous-protein production. Cells derived
from African green monkey kidney (COS), baby hamster kidney (BHK),
and human embryonic kidney (HEK 293) are used for short-term
(transient) gene expression for either rapid production of small amounts of
heterologous proteins for evaluating their potential as drug candidates or
testing the integrity of constructs during various stages of vector
development. Important characteristics of these cells are their receptiveness to
transfection, their ability to grow in suspension cultures without addition
of serum that contains animal proteins, and their suitability for
high-density, large-scale production. The ability to grow in serum-free medium not
only reduces costs, but also facilitates purification of the target protein and
reduces the risk of contamination with animal-derived material. Chinese
hamster ovary (CHO) cells and mouse myeloma cells are most commonly
used for long-term (stable) gene expression and when high yields of
heterologous proteins are required. About 140 recombinant proteins are
currently approved for therapeutic use, most produced in CHO cells that have
been adapted for growth in high-density suspension cultures, and many
more are in clinical trials. Although mammalian cells have been used for
some time to produce therapeutic proteins, especially antibodies, and
vectors carrying suitable expression signals have been developed, current
efforts are aimed at improving productivity through the development of 
high-producing cell lines, increasing the stability of production over time, 
and increasing expression by manipulating the chromosomal environment 
in which the recombinant genes are integrated. 

Vector Design 

Most cloning vectors constructed for the expression of heterologous genes 
in mammalian cells are based on the genomes of viruses that infect
mammalian cells. The first vector was based on a simian virus (simian virus 40
[SV40]) that can replicate in several mammalian species. The genome of 
this virus is a double-stranded DNA molecule of 5.2 kb that carries genes 
expressed early in the infection cycle that function in the replication of viral 
DNA (early genes) and genes expressed later in the infection cycle that 
function in the production of viral capsid proteins (late genes). For use as a 


FIGURE 7.22 (A) Virus-like particles are made up of viral coat (capsid) or envelope 
proteins that assemble to form a structure that resembles the original virus but does 
not contain the viral genetic material. (B) Viral spike (S), membrane (M), and
envelope (E) proteins, which comprise the envelope of the human SARS-CoV, can be
produced in insect cells using a single recombinant baculovirus vector carrying all 
three viral genes. Following expression, the S, M, and E proteins self-assemble to 
form a SARS-CoV virus-like particle that is a candidate vaccine for protection 
against SARS. Pp, polyhedrin promoter; 10p, baculovirus p10 promoter.

FIGURE 7.23 Generalized mammalian expression vector. The multiple cloning site
(MCS) and selectable marker gene (SMG) are under the control of eukaryotic
promoter (p), polyadenylation (pa), and termination of transcription (TT) sequences.
An intron (I) enhances the production of heterologous protein. Propagation of the
vector in E. coli and mammalian cells depends on the origins of replication ori E and
ori euk, respectively. The ampicillin resistance (Amp^r) gene is used for selecting
transformed E. coli.

cloning vector, some of the early and late genes are removed and replaced 
with a target gene under the control of appropriate mammalian expression 
signals. Although many cloning vectors are based on SV40 DNA, its use is 
restricted to small inserts because only a limited amount of DNA can be 
packaged into the viral capsid. Other vectors that can accommodate larger 
amounts of cloned DNA are derived from adenovirus; bovine
papillomavirus, which can be maintained as a multicopy plasmid in some
mammalian cells; and adeno-associated virus, which can integrate into specific sites
in the host chromosome. 

While hundreds of mammalian expression vectors have been developed,
they share the same basic features and are not very different
in design from other eukaryotic expression vectors. A representative
mammalian expression vector (Fig. 7.23) contains a eukaryotic origin of
replication, usually from an animal virus, such as SV40. The promoter
sequences that drive expression of the cloned gene(s) and the selectable
marker gene(s), and the transcription termination sequences
(polyadenylation signals), must be eukaryotic and are frequently taken from either
animal viruses (cytomegalovirus [CMV], SV40, or herpes simplex virus) or
mammalian genes (β-actin, metallothionein, thymidine kinase, or bovine
growth hormone). Strong constitutive promoters and efficient
polyadenylation signals are preferred. Inducible promoters are often used when
continuous synthesis of the heterologous protein is toxic to the host cell.
Expression of a gene of interest is often increased by placing the sequence
for an intron between the promoter and the multiple cloning site within the
transcribed region. The sequences that are required for selection and
propagation of a mammalian expression vector in E. coli are derived from a
standard E. coli cloning vector, such as pBR322.

For the best results, a gene of interest must be equipped with
translation control sequences (Fig. 7.24). Initiation of translation in higher
eukaryotic organisms depends on a specific sequence of nucleotides surrounding
the start (AUG) codon in the mRNA called the Kozak sequence, e.g.,
GCCGCC(A or G)CCAUGG in vertebrates. The corresponding DNA
sequence for the Kozak sequence, which is often followed by a signal
sequence to facilitate secretion, a protein sequence (tag) to enhance the
purification of the heterologous protein, and a proteolytic cleavage
sequence that enables the tag to be removed from the heterologous protein,
is placed at the 5' end of the gene of interest. A stop codon is added to
ensure that translation ceases at the correct location. Finally, the sequence
content of the 5' and 3' untranslated regions (UTRs) is important for
efficient translation and mRNA stability. Either synthetic 5' and 3' UTRs or
those from the human β-globin gene are used in mammalian expression
vectors. The codon content of the gene of interest may also require
modification to suit the translational preferences of the host cell.
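
As a small illustration of the Kozak consensus quoted above, the following sketch scans a transcript for the vertebrate pattern GCCGCC(A or G)CCAUGG and reports where the start codon lies. The function name and the example sequence are hypothetical, and requiring an exact match to the full consensus is a simplifying assumption; the sketch does not attempt to score partial matches.

```python
import re

# Vertebrate Kozak consensus from the text: GCCGCC(A or G)CCAUGG,
# in which AUG is the start codon.
KOZAK = re.compile(r"GCCGCC[AG]CCAUGG")


def find_kozak_start(sequence):
    """Return the 0-based index of the AUG in the first full consensus match.

    Accepts RNA or DNA (T is read as U). Returns None when no exact
    match to the consensus is present.
    """
    mrna = sequence.upper().replace("T", "U")
    match = KOZAK.search(mrna)
    if match is None:
        return None
    # The AUG begins 9 nucleotides into the 13-nucleotide consensus.
    return match.start() + 9


example = "GGAGCCGCCACCAUGGCUUAA"  # hypothetical transcript fragment
start = find_kozak_start(example)
print(start, example[start:start + 3])
```

The same function accepts the corresponding DNA sequence, because thymine is converted to uracil before matching.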

The majority of mammalian cell expression vectors carry a single gene 
of interest that encodes a functional polypeptide. However, the active form 
of some commercially important proteins consists of two different protein 
chains. For example, human thyroid-stimulating hormone is a two-chain 
protein (heterodimer), and both hemoglobin and antibodies are tetramers 
with two copies of each subunit, α2β2 and H2L2, respectively. It is possible
to clone the gene or cDNA for each subunit of a multimeric protein,
synthesize and purify each subunit separately, and then mix the chains together
in a test tube. Unfortunately, relatively few multichain proteins are
properly assembled in vitro. By contrast, in vivo assembly of dimeric and
tetrameric proteins is quite efficient. Consequently, various strategies have been
devised for the production of two different recombinant proteins within 
the same cell. 

Two mammalian expression vectors, each with the gene or cDNA for
one of the subunits and a different selectable marker gene, can be
cotransfected into host cells (Fig. 7.25). The transfected cells are treated with both
selecting agents, and the cells that survive carry both vectors. Two-vector
systems have been used successfully for the production of authentic
dimeric and tetrameric recombinant proteins. However, loss of one of the
two vectors in doubly transfected cells is common. Moreover, the two
vectors are not always maintained with the same copy number, so one
subunit is overproduced relative to the other and yields of the final product
are reduced. To overcome these problems, single vectors that carry two
cloned genes have been developed. In some instances, the two genes are
placed under the control of independent promoters and polyadenylation
signals (double-cassette vectors) (Fig. 7.26). Alternatively, to ensure that
equal amounts of the recombinant proteins are synthesized, vectors
(bicistronic vectors) have been constructed with the two cloned genes separated
from each other by a DNA sequence that contains an internal ribosomal
entry site (IRES) (Fig. 7.27). Parenthetically, IRESs are found in
mammalian virus genomes, and after transcription, they allow simultaneous
translation of different proteins from a polycistronic mRNA molecule.
Transcription of a "gene α-IRES-gene β" construct is controlled by one
promoter and polyadenylation signal. Under these conditions, a single
"two-gene" (bicistronic) transcript is synthesized, and translation
proceeds from the 5' end of the mRNA to produce one of the chains (chain α)


FIGURE 7.24 Translation control elements. A gene of interest can be fitted with
various sequences that enhance translation and facilitate both secretion and
purification, such as a Kozak sequence (K), signal sequence (S), protein affinity tag (T),
proteolytic cleavage site (P), and stop codon (SC). The 5' and 3' UTRs increase the
efficiency of translation and contribute to mRNA stability.

FIGURE 7.25 Two-vector expression system. The cloned genes (gene α and gene β)
encode subunits of a protein dimer (αβ). After cotransfection, both subunits (α and
β) are synthesized and assembled into a functional protein dimer. Both vectors
carry origins of replication for E. coli (ori E) and mammalian cells (ori euk) and a
marker (Amp^r) gene for selecting transformed E. coli. The selectable marker genes
(SMG1 and SMG2) and the cloned genes (gene α and gene β) are each under the
control of promoter (p), polyadenylation (pa), and termination of transcription (TT)
sequences.


and internally from the IRES element to produce the second chain (chain
β) (Fig. 7.27). Generally, constructing an effective mammalian expression
vector is time-consuming and demands considerable effort to achieve 
optimum protein production. 
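
The argument that unequal subunit production lowers the yield of a multichain protein can be made concrete with a little arithmetic. The sketch below is a toy calculation that assumes perfect assembly and treats subunit amounts as simple counts; the function names and the numbers are hypothetical and are not taken from any experiment.

```python
def heterodimer_yield(alpha_units, beta_units):
    """Return (dimers, leftover alpha, leftover beta) for an alpha-beta dimer.

    Assumes every available pair assembles, so the scarcer chain
    limits the number of dimers.
    """
    dimers = min(alpha_units, beta_units)
    return dimers, alpha_units - dimers, beta_units - dimers


def tetramer_yield(heavy_units, light_units):
    """Return the number of H2L2-type tetramers (two copies of each chain)."""
    return min(heavy_units // 2, light_units // 2)


# Unbalanced production, as might occur when two vectors differ in copy number.
print(heterodimer_yield(100, 40))
# Balanced production of the same total number of chains.
print(heterodimer_yield(70, 70))
print(tetramer_yield(100, 40))
```

With the same total number of chains, the balanced case gives more assembled product and no unpaired subunits, which is the motivation for vectors designed to produce the two chains in equal amounts.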

Baculovirus Vectors for Protein Production in Mammalian Cells 

It is possible to use some of the baculovirus delivery systems that have 
already been developed to express target proteins in mammalian cells, such 
as HEK 293, CHO, COS, and human HeLa cells. Although baculoviruses 
cannot replicate in mammalian cells, they can be transduced into these cells 
with a transduction efficiency reaching 100% in some cases, where they 
enter the nucleus and express heterologous genes that were inserted in the 
viral genome. Both adherent cells growing in a single layer on a solid
surface and cells suspended in culture can be transduced. Because
baculoviruses cannot replicate in mammalian cells and the polyhedrin-deficient
strains employed as vectors cannot infect insects, this system presents a 
safe alternative to some of the other approaches to heterologous-protein 
expression in mammalian cells.

FIGURE 7.26 Two-gene expression vector. The cloned genes (gene α and gene β)
encode subunits of a protein dimer (αβ). The cloned genes are inserted into a vector
and are under the control of different eukaryotic promoter (p), polyadenylation (pa),
and termination of transcription (TT) sequences. Each subunit is translated from a
separate mRNA, and a functional protein dimer (αβ) is assembled. The vector has
origins of replication for E. coli (ori E) and mammalian cells (ori euk), a marker gene
(Amp^r) for selecting transformed E. coli, and a selectable marker gene (SMG) that is
under the control of eukaryotic promoter (p), polyadenylation (pa), and termination
of transcription (TT) sequences.


Appropriate promoter, polyadenylation, and transcription
termination sequences that are functional in mammalian cells must be included
on the recombinant baculovirus delivery system, as they are for other 
mammalian expression systems. First, a recombinant AcMNPV vector is 
constructed to carry the target gene under the control of mammalian cell 
expression signals, using methods such as site-specific recombination 
(shown in Fig. 7.20). Next, the titer of the recombinant virus is increased 
by replication in insect cell lines. Finally, purified recombinant AcMNPV 
DNA is transduced into mammalian cells. The mechanism of baculovirus 
uptake by mammalian cells is unclear but seems to involve specific
protein interactions. The baculovirus envelope glycoprotein 64 (gp64) may
be important, because increasing the levels of this protein, for example, 
by including additional copies of the gp64 gene in the AcMNPV vector, 
results in increased transduction. Transduction efficiency and,
subsequently, target gene expression, are often variable but can be optimized
by incubating the baculovirus and mammalian cells for a longer time and 
at a lower temperature, by adding the viral inoculum over a period of 
several days, or by addition of the chemical butyrate or trichostatin A,

FIGURE 7.27 Bicistronic expression vector. The cloned genes (gene α and gene β)
encode subunits of a protein dimer (αβ). Each cloned gene is inserted into a vector
on either side of a sequence for an IRES. The two genes and the IRES sequence form
a transcription unit under the control of a single eukaryotic promoter (p),
polyadenylation (pa), and termination of transcription (TT) sequence. Translation of the
mRNA occurs from the 5' end and internally (right-angled arrows). Both subunits
(α and β) are synthesized and assembled into a functional protein dimer (αβ). The
vector carries origins of replication for E. coli (ori E) and mammalian cells (ori euk), a
selectable marker (Amp^r) for selecting transformed E. coli, and a selectable marker
gene (SMG) that is under the control of eukaryotic promoter (p), polyadenylation
(pa), and termination of transcription (TT) sequences.


which prevents modification of chromatin structure that can inhibit gene 
expression. 

Expression of a heterologous gene from the baculovirus vector in 
mammalian host cells is typically transient, although it can be sustained 
for several days to weeks. For stable long-term expression, the target 
gene is integrated into the host genome. Stable cell lines can be generated 
by including a selectable marker, such as the gene encoding neomycin 
phosphotransferase (see below) under the control of an appropriate 
mammalian promoter, on the AcMNPV vector. Under selective
conditions, fragments of baculovirus DNA ranging in size from 5 to 18 kb and
containing the target gene were found to be randomly integrated into the 
mammalian host cell genome, albeit at a relatively low frequency. 
Integration into host DNA using baculoviruses can be increased
significantly by engineering the AcMNPV vector to carry the adeno-associated
virus inverted terminal repeat regions (145 base pairs) flanking the target
gene (Fig. 7.28). Similar to the attachment (att) sequences used by
bacteriophage λ to integrate into specific sites in the bacterial genome
(described above), the inverted terminal repeat sequences are used by
adeno-associated virus for integration into the host genome. The rep gene
must also be included, as the encoded protein directs site-specific
integration.

Selectable Markers for Mammalian Expression Vectors 

For the most part, the systems that are used to select transfected
mammalian cells are the same as those for other eukaryotic host cells (Table 7.7). In
fact, a number of bacterial marker genes have been adapted for eukaryotic
cells. For example, the bacterial neo gene, which encodes neomycin
phosphotransferase, is often used to select transfected mammalian cells.
However, in eukaryotic cells, G-418 (Geneticin), which is phosphorylated
by neomycin phosphotransferase, replaces neomycin as the selective agent
because neomycin is not an effective inhibitor of eukaryotic protein
synthesis.

Some selection schemes are designed not only to identify transfected 
cells, but to increase heterologous-protein production by amplifying the 
copy number of the expression vector (Fig. 7.29). The dihydrofolate
reductase-methotrexate system falls into this category. Briefly,
dihydrofolate reductase catalyzes the reduction of dihydrofolate to tetrahydrofolate,
which is required for the production of purines. Methotrexate is a
competitive inhibitor of dihydrofolate reductase. Sensitivity to methotrexate
can be overcome if the cell produces excess dihydrofolate reductase, and as
the methotrexate concentration is increased over a period of time, the
dihydrofolate reductase gene in cultured cells is amplified. It is not unusual for
methotrexate-resistant cells to have hundreds of dihydrofolate reductase 
genes. The standard dihydrofolate reductase-methotrexate protocol entails 
transfecting dihydrofolate reductase-deficient cells with an expression 
vector carrying a dihydrofolate reductase gene as the selectable marker 
gene and treating the cells with methotrexate. After the initial selection of 
transfected cells, the concentration of methotrexate is gradually increased, 
and eventually cells with very high copy numbers of the expression vector 
are selected. 
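
The stepwise selection described above can be mimicked with a short simulation. This is only a sketch under stated assumptions: the doubling probability, the rule that a cell survives when its copy number reaches the current threshold, the regrowth step, and the threshold values are all hypothetical choices made for illustration, not measured parameters.

```python
import random


def amplify(copy_numbers, mtx_steps, rng, gain_probability=0.2):
    """Stepwise methotrexate selection on a toy cell population.

    copy_numbers: dhfr copy number of each starting cell.
    mtx_steps: increasing thresholds; in this toy rule a cell survives a
               step only if its copy number is at least the threshold.
    """
    size = len(copy_numbers)
    population = list(copy_numbers)
    for threshold in mtx_steps:
        # Chance event: some cells duplicate the region carrying dhfr.
        population = [c * 2 if rng.random() < gain_probability else c
                      for c in population]
        # Selection: cells with too few copies do not survive.
        population = [c for c in population if c >= threshold]
        if not population:
            break
        # Survivors regrow to the original population size.
        population = [rng.choice(population) for _ in range(size)]
    return population


rng = random.Random(0)  # fixed seed so the run is repeatable
final = amplify([1] * 1000, mtx_steps=[1, 2, 4, 8, 16], rng=rng)
print(len(final), min(final) if final else "culture lost")
```

Raising the threshold gradually matters in this model: each step only requires one further chance duplication among cells that already passed the previous step, whereas a single jump to the highest threshold would leave no survivors.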


FIGURE 7.28 Baculovirus vector for stable expression of target genes in mammalian 
cells. The target gene with appropriate promoter (p) and terminator (t) sequences is 
inserted into an AcMNPV vector between sequences for adeno-associated virus 
inverted terminal repeats (ITR). Following the transfection of mammalian cells, the 
protein encoded by the rep gene from adeno-associated virus mediates site-directed 
integration of the target gene into the host genome via the ITR sequences.

TABLE 7.7 Selective marker gene systems for mammalian cells

Selective agent    Action of selective agent      Marker gene                            Action of marker gene protein
-----------------  -----------------------------  -------------------------------------  ---------------------------------------------------------
Xyl-A              Damages DNA                    Adenine deaminase (ada)                Deaminates Xyl-A
Blasticidin S      Inhibits protein synthesis     Blasticidin S deaminases (Bsr, BSD)    Deaminates blasticidin S
Bleomycin          Breaks DNA strands             Bleomycin-binding protein (Ble)        Binds to bleomycin
G-418 (Geneticin)  Inhibits protein synthesis     Neomycin phosphotransferase (neo)      Phosphorylates G-418
Histidinol         Produces cytotoxic effects     Histidinol dehydrogenase (hisD)        Oxidizes histidinol to histidine
Hygromycin B       Inhibits protein synthesis     Hygromycin B phosphotransferase (Hph)  Phosphorylates hygromycin B
MSX                Inhibits glutamine synthesis   Glutamine synthetase (GS)              Cells that produce excess glutamine synthetase survive
MTX                Inhibits DNA synthesis         Dihydrofolate reductase (dhfr)         Cells that produce excess dihydrofolate reductase survive
PALA               Inhibits pyrimidine synthesis  Cytosine deaminase (codA)              Lowers cytosine levels in the medium by converting cytosine to uracil
Puromycin          Inhibits protein synthesis     Puromycin N-acetyltransferase (Pac)    Acetylates puromycin

MSX, methionine sulfoximine; MTX, methotrexate; PALA, N-(phosphonacetyl)-L-aspartate; Xyl-A, 9-β-D-xylofuranosyl adenine.
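
For planning a selection scheme, the agent and marker pairs in Table 7.7 can be held in a simple lookup. The sketch below is illustrative only: the dictionary layout and the function name are assumptions, and the check that cotransfected vectors carry different markers reflects the two-vector strategy described earlier in this chapter.

```python
# Selective agent -> marker gene, taken from Table 7.7.
MARKERS = {
    "Xyl-A": "ada",
    "blasticidin S": "Bsr/BSD",
    "bleomycin": "Ble",
    "G-418": "neo",
    "histidinol": "hisD",
    "hygromycin B": "Hph",
    "MSX": "GS",
    "MTX": "dhfr",
    "PALA": "codA",
    "puromycin": "Pac",
}


def selection_medium(vector_markers):
    """Return the selective agents needed to retain every vector.

    vector_markers: one marker gene name per cotransfected vector.
    Raises ValueError if two vectors share a marker, because a single
    agent could not then select for both vectors.
    """
    if len(set(vector_markers)) != len(vector_markers):
        raise ValueError("each cotransfected vector needs a different marker")
    agent_for = {gene: agent for agent, gene in MARKERS.items()}
    return sorted(agent_for[gene] for gene in vector_markers)


print(selection_medium(["neo", "Hph"]))
```

A marker name that is not in the table raises a KeyError, which is a reasonable way for a sketch like this to flag an unsupported selection system.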


Engineering Mammalian Cell Hosts for Enhanced Productivity 

Several improvements have been made to mammalian cell lines to enhance 
their productivity by increasing cell growth, vector stability, gene
expression, and protein secretion. Conditions in large-scale bioreactors can be
stressful for mammalian cells. Depleted nutrients and accumulation of 
toxic cell waste can limit the viability and density of cells as they respond 
to stress by inducing cell death, also known as apoptosis. Often chemical 
inhibitors of cell death pathways are utilized, but recently, genetic means 
have been explored to construct cell lines in which the apoptotic pathways 
are inhibited. When cells perceive a variety of stresses, an initial response 
is the activation of the tumor suppressor protein p53, which is a
transcription factor that induces the expression of genes that encode proteins in the
apoptotic pathway. One method to improve cell growth and viability under 
culture conditions in bioreactors is to prevent p53 from activating the cell 
death response pathway. The mouse double minute 2 protein (MDM2)
binds to protein p53 and prevents it from acting as a transcription factor 
(Fig. 7.30). MDM2 also marks p53 for degradation. HEK 293 and CHO cells 
were transfected with plasmids containing a regulatable MDM2 gene and 
cultured under conditions that mimicked the late stages of cell culture and 
in nutrient-limited medium. Cultures expressing MDM2 had higher cell 
densities and delayed cell death compared to nontransfected cells,
especially in nutrient-deprived medium.

Many cultured mammalian cells are unable to achieve high cell
densities in cultures because toxic metabolic products accumulate in the culture
medium and inhibit cell growth. Although efforts are made to optimize the
culture conditions, inevitably nutrients essential for optimal cell growth,
including oxygen, are reduced. Under low-oxygen conditions, many cells,

FIGURE 7.29 Schematic representation of the process of selecting for an increased
copy number of a gene in cultured cells. As the concentration of an inhibitor of a 
vital enzyme ([x]) is increased, cells with extra copies of the gene that encodes this 
enzyme survive. The increments of the gene copy number occur by chance among 
thousands of cells. The circles indicate cells; the numerals indicate numbers of gene 
copies. 


including CHO cells, secrete the acidic waste product lactate as they 
struggle to obtain energy from glucose. Under these conditions, pyruvate, 
an intermediate compound produced during the metabolism of glucose, is 
converted to lactate by lactate dehydrogenase rather than entering into the 
tricarboxylic acid cycle, where it is further oxidized through the activity of 
pyruvate carboxylase (Fig. 7.31). Pyruvate carboxylase has a lower level of 
activity in the absence of oxygen. To counteract the acidification of the 
medium from lactate secretion, alkaline compounds are typically added; 
however, they also contribute to reduced cell growth by increasing the 
osmolality of the medium. A more effective approach may be to either 
decrease the expression of lactate dehydrogenase or increase the expression 
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FIGURE 7.30 Strategy to increase yields of recombinant mammalian cells. Cell death 
(apoptosis), stimulated by the transcription factor p53, can lead to decreased yields 
of recombinant mammalian cells grown under stressful conditions in large bioreac¬ 
tors. To prevent cell death, the gene encoding MDM2 is introduced into mammalian 
cells. The MDM2 protein binds to p53 and prevents it from inducing expression of 
proteins required for apoptosis. Engineered cells not only showed delayed cell 
death, but also achieved higher cell densities in bioreactors. 


and/or the activity of pyruvate carboxylase in host cells. The human pyru¬ 
vate decarboxylase gene was cloned into an expression vector under the 
control of the CMV promoter and the SV40 polyadenylation signals and 
transfected into CHO cells. Under selective conditions, the pyruvate car¬ 
boxylase gene was stably integrated into the CHO genome and expressed, 
and the enzyme was detected in the mitochondria, where glucose is 
degraded. After 7 days in culture, 85% of the CHO cells without the human 
pyruvate carboxylase gene were viable. In contrast, 96% of the cells that 
expressed human pyruvate carboxylase were still viable. The rate of lactate 
production decreased by up to 40% in the engineered cells. 

Many of the eukaryotic DNA viruses from which the vectors used in
mammalian cells are derived maintain their genomes as multicopy
episomal DNA (plasmids) in the host cell nucleus. These viruses produce
proteins, such as the large-T antigen in SV40 and the nuclear antigen 1
protein in Epstein-Barr virus, that help to maintain the plasmids in the host
nucleus and to ensure that each host cell produced after cell division
receives a copy of the plasmid. To increase the copy number of the target
gene by increasing the plasmid copy number, HEK 293 cells and other cell
lines have been engineered to express the SV40 large-T antigen or
Epstein-Barr nuclear antigen 1.

Many heterologous proteins of therapeutic value, such as antibodies
and interferon, are secreted. However, the high levels of these proteins that
are desirable from a commercial standpoint can quickly overwhelm the
capacity of the cell secretory system. Thus, protein processing is a major
limiting step in the achievement of high target protein yields. Although
high levels of recombinant protein production have been found to increase
the levels of proteins associated with proper protein folding and secretion
in the endoplasmic reticulum, the levels are usually not sufficient for
optimal protein processing. Researchers have therefore devised methods to
increase the capacity for protein secretion by engineering cell lines with
enhanced production of components of the secretion apparatus. Although
this can be achieved in yeast and insects by transforming host cells with
additional copies of the genes encoding the endoplasmic reticulum chaperone
proteins BiP and protein disulfide isomerase (described above), this
approach is not very effective in mammalian host cells, possibly because
these engineered cell lines overexpress only a single component of the
secretion pathway. A more effective strategy may be to simultaneously
overexpress several, if not all, of the proteins that make up the secretory
mechanism. Simultaneous upregulation of the genes encoding these proteins
can be achieved through the enhanced production of the transcription factor
X box protein 1 (Xbp-1), a key regulator of the secretory pathway.
Normally, full-length, unspliced xbp-1 mRNA is found in nonstressed cells
and is not translated into a stable, functional protein (Fig. 7.32A).
However, when unfolded or misfolded proteins accumulate in the endoplasmic
reticulum, a ribonuclease is activated that specifically cleaves xbp-1
mRNA (Fig. 7.32B). This results in the production of a functional
transcription factor that activates the expression of a number of proteins
of the secretion apparatus. A truncated xbp-1 gene that encodes an actively
translated form of xbp-1 mRNA (Fig. 7.32C) was overexpressed under the
control of the CMV promoter in recombinant CHO cell lines that were
previously constructed to express human erythropoietin, human γ-interferon,
and human monoclonal antibodies either stably or transiently. Expression of
the genes encoding proteins of the secretion apparatus that are controlled
by Xbp-1 was found to increase in response to elevated levels of Xbp-1.
Although overexpression of Xbp-1 did not increase the production of
recombinant proteins in stable cell lines, a significant increase was
observed in cell lines engineered to express the target proteins
transiently.

FIGURE 7.31 When oxygen is present, pyruvate, which is formed from glucose
during glycolysis, is converted by the enzyme pyruvate carboxylase to an
intermediate compound in the tricarboxylic acid (TCA) cycle. This metabolic
pathway is important for the generation of cellular energy and for the
synthesis of biomolecules required for cell proliferation. However, under
low-oxygen conditions, such as those found in large bioreactors, pyruvate
carboxylase has a low level of activity. Under these conditions, lactate
dehydrogenase converts pyruvate into lactate, which yields a lower level of
energy. Cultured cells secrete lactate, thereby acidifying the medium.

Plasmid Integration and Chromosomal Environment 

A major consideration for high levels and long-term stability of
heterologous-protein production is the site of integration of the gene of
interest into the mammalian cell chromosome. Expression of high levels of
protein from plasmid vectors is transient and inevitably results in loss of
the vector, which cannot be propagated in mammalian cells, or death of the
host cell. Stable cell lines in which the target gene is integrated into the
chromosome have been generated to overcome this problem. However, the site
of integration can have a significant impact on the levels of target protein
produced. Genomic DNA is associated with a great number of proteins,
including the major histone proteins, around which the DNA is coiled, that
compact (condense) the DNA so that it can fit inside the nucleus. The DNA
and associated packaging proteins are known as chromatin. While much of
the genome is highly condensed (heterochromatin) and contains silent
genes or genes with low levels of expression, other regions are less
condensed (euchromatin) and contain actively transcribed genes. For enhanced
expression and stability, the target gene should be integrated into
euchromatin, rather than heterochromatin. Because a larger portion of the
genome is in the heterochromatin form, there is a greater chance that the
target gene will be inserted into one of these regions. Therefore, genetic
engineering strategies to prevent the surrounding DNA from decreasing
transcription of inserted genes are being explored.

FIGURE 7.32 Strategy to increase yields of secreted recombinant proteins from
mammalian cells by simultaneously upregulating the expression of several proteins in
the secretion apparatus. The expression of chaperones and other proteins of the
secretion apparatus is controlled by the transcription factor Xbp-1. (A) In unstressed
cells, the intron (green box) is not cleaved from the xbp-1 transcript, and therefore,
functional Xbp-1 transcription factor is not produced. (B) However, in stressed cells
that have accumulated misfolded proteins, an endoribonuclease cleaves the
transcript to yield mature xbp-1 mRNA (the red and blue boxes represent exons) that is
translated into a stable, functional transcription factor. (C) Recombinant CHO cells
were transfected with a truncated gene including only the xbp-1 exons and
overproduced a functional Xbp-1 transcription factor that directed the production of
high levels of proteins required for protein secretion.

These strategies exploit natural cellular processes (known as epigenetic 
modifications) that contribute to the dynamic state of chromatin. Chromatin 
structure, that is, the degree of DNA packing, is altered in two general 
ways. Histone proteins are modified by addition of chemical groups, such 
as an acetyl group, to specific amino acids, and cytosines at specific sites in 
the chromosomal DNA can be methylated. Techniques to relax chromatin 
structure and thereby increase the expression of introduced genes include 
modifying host strains to express proteins that alter chromatin structure at 
the site of vector integration or inserting DNA elements that prevent
chromosome condensation together with the target gene.

One approach to alter the epigenetic environment surrounding the
inserted gene is to increase histone acetylation. The extent of histone
acetylation is determined by the relative activities of two host cell
enzymes: histone acetyltransferase, which adds acetyl groups to lysine
residues on histone proteins, and histone deacetylase, which removes acetyl
groups from the histone. The relative influences of these two enzymes at a
given promoter are determined by specific transcription factors that recruit
one or the other of the enzymes to the promoter. Increased histone
acetylation, which leads to increased gene transcription, can be
accomplished either by increasing the expression of histone
acetyltransferase or by decreasing the activity of histone deacetylase.
Although chemical inhibitors of histone deacetylation, such as butyrate, are
often added to the culture medium, they have a broad, genomewide effect on
cellular histones and may inhibit cell growth and stimulate apoptosis. A
better strategy is to target histone acetyltransferase specifically to the
site of target gene insertion to ensure that the target gene is actively and
continuously transcribed. One group of researchers created a stable CHO cell
line in which histone acetyltransferase was produced as a fusion protein
with the LexA protein that binds to specific DNA sequences (Fig. 7.33). To
test this fusion protein for the ability to specifically increase expression
of integrated target genes, the green fluorescent protein (GFP) reporter
gene was employed as a target gene and was integrated into a CHO chromosome
under the control of the CMV promoter with the LexA-binding sequence
inserted upstream. A gene encoding resistance to the antibiotic Zeocin (a
member of the bleomycin/phleomycin family of antibiotics) was coupled to
the reporter gene by an IRES element and therefore was also under the
control of the CMV promoter. Stable cells with an active CMV promoter were
established by addition of Zeocin to the culture medium. Production of GFP,
determined by measuring the emission of green fluorescence in a
spectrophotometer, was severalfold higher in cells that expressed the
LexA-histone acetyltransferase fusion protein than in those that expressed
the LexA protein alone (Fig. 7.33A). The LexA protein specifically binds to
the LexA recognition site upstream of the gene encoding GFP and brings with
it the fused histone acetyltransferase protein that acetylates histones in
the promoter region and promotes a higher level of GFP transcription.
Moreover, expression remained stable, although at a lower level, for at
least 4 months in some of the clones.

To improve expression levels over a longer period, the construct was
further modified to include a DNA segment known as a stabilizing and
antirepressor (STAR) element on both sides of the expression cassette to
block repression (Fig. 7.33B). Repression can occur when heterochromatin
forms due to the association of the heterochromatin protein HP1 with
methylated histones. This stimulates further histone deacetylation and
methylation and, consequently, greater HP1 activity. Insertion of the
relatively small (<2-kb) STAR elements was found to counteract the activity
of HP1 and other heterochromatin-associated repressor proteins. Flanking
the expression cassette with the antirepressor elements resulted in higher
levels of GFP expression that were maintained over a longer period of
time.

Other DNA elements that improve heterologous-protein expression
by modifying heterochromatin structure are the ubiquitous
chromatin-opening elements and matrix-associated regions. Ubiquitous
chromatin-opening elements are sequences of DNA normally found near the
promoters of housekeeping genes that are constitutively expressed at high
levels due to enhanced histone acetylation. Inclusion of the ubiquitous
chromatin-opening element from the promoter of the highly expressed
CHO elongation factor 1 alpha gene in an expression vector increased
recombinant protein expression in CHO cells 6- to 35-fold.
Matrix-associated regions were also found to enhance the production of
heterologous protein in CHO cells. These elements, found in the chromosomes
of many eukaryotes, bind to protein complexes in the nucleus that arrange
regions of the chromosome into loops. It is thought that these DNA loops
contain transcriptionally active genes that are regulated in a coordinated
fashion. Although matrix-associated regions from the human β-globin
gene and the chicken lysozyme gene were found to increase expression of
a target gene, not all matrix-associated regions have a positive effect on
gene expression.

FIGURE 7.33 Strategies to increase expression of recombinant proteins in mammalian
cells by altering chromatin structure. Local "relaxation" of chromosome
condensation, which leads to increased transcription of genes in the region, can be
achieved by the addition of an acetyl group to DNA-packing proteins known as
histones. Histone acetylation is catalyzed by the enzyme histone acetyltransferase
(HAT). (A) To increase the expression of a recombinant protein, HAT was directed to
the site of target gene (GFP gene) insertion in a mammalian chromosome. HAT was
expressed as a fusion protein with the LexA protein that binds to a specific DNA
sequence (LexA-BS) inserted upstream of the CMV promoter (P_CMV) that directs
expression of GFP. Production of the HAT-LexA fusion protein under the control of the
SV40 promoter (P_SV40) increased expression of GFP sixfold compared to production of
the LexA protein alone. (B) Insertion of STAR elements on both sides of the expression
cassette further increased GFP expression. The gene encoding resistance to the
antibiotic Zeocin was included as a selectable marker and was expressed from an IRES.
The arrows above the promoter boxes indicate the direction of transcription.

In sum, mammalian cell expression systems are as versatile and
effective as other eukaryotic expression systems. However, industrial
production of a recombinant protein with engineered mammalian cells is costly.
Consequently, less expensive expression systems are favored unless
authenticity of an important recombinant protein can be obtained only
with mammalian cells.


SUMMARY 


A number of heterologous proteins have been successfully
synthesized in prokaryotic host cells. However, many
proteins require eukaryote-specific posttranslational
modifications, such as glycosylation, to be functional. Consequently,
expression systems were devised for fungal, insect, and
mammalian cells. With respect to the ease and likelihood of
obtaining an authentic protein from a cloned gene, each of
these systems has distinct merits and shortcomings. In other
words, there is no single eukaryotic host cell that is capable of
producing an authentic protein from every cloned gene.

All eukaryotic expression vectors have the same basic
format. The gene of interest, which may be equipped with
sequences that facilitate the secretion and purification of the
heterologous protein, is under the control of eukaryotic
promoter and polyadenylation and transcription terminator
sequences. To simplify both maintenance and recombinant
DNA manipulations, eukaryotic expression vectors are
routinely maintained in E. coli.

Several different fungus-based expression systems have
been developed for the production of heterologous proteins.
The yeast S. cerevisiae, which is well characterized genetically
and can be grown in large fermenters, has been used
extensively for this purpose. Both episomal and integrating
expression vectors have been constructed. However, with S. cerevisiae
as the host cell, a number of recombinant proteins are
hyperglycosylated, and in some cases, protein yields are low
because the capacity of the cell to properly fold and secrete
proteins has been exceeded. Other yeast and filamentous
fungal systems have been developed for the production of
heterologous proteins. Of these, the methylotrophic yeast
P. pastoris has been used successfully because of the low
occurrence of hyperglycosylation, the ease of obtaining high cell
densities, and the rapid and strong response of the AOX1
promoter (usually used to drive the gene of interest) to methanol.
A "humanized" strain of P. pastoris has been genetically
altered to produce glycoproteins with glycosylation patterns
that are identical to those found on the same proteins
produced in human cells.

A large number of biologically active heterologous
proteins have also been produced in insect cells grown in culture
using baculoviruses to deliver the gene of interest into the
insect host cell. This system is advantageous because
posttranslational protein modification is similar in insects and
mammals, and the baculoviruses used in these systems do
not infect humans or other insect cells. The baculovirus most
commonly used as a vector is AcMNPV. A gene of interest is
inserted into the AcMNPV genome by homologous or
site-specific recombination between sequences on a transfer
vector carrying the target gene and the AcMNPV DNA.
Recombination occurs either in insect cells doubly transfected
with the transfer vector and viral DNA, in E. coli as an
intermediate host, or in an in vitro reaction catalyzed by purified
integration enzymes. The last two methods eliminate the
need to identify and purify recombinant baculoviruses using
plaque assays. Once the target gene has been inserted,
recombinant AcMNPV DNA is introduced into insect cells for
heterologous-protein production. Improved insect host cells
have been developed through genetic engineering to increase
protein yields and to ensure that target proteins are properly
glycosylated. In addition to production of a single protein of
interest, the baculovirus-insect expression system is
particularly amenable to producing functional multimeric protein
complexes, such as virus-like particles, which are effective
vaccines.

Many therapeutic proteins that require a full complement
of posttranslational modifications are now produced in
cultured mammalian cells, such as CHO cells. Most of the
vectors that have been developed to introduce foreign genes into
mammalian cells are based on mammalian viruses, especially
SV40. The viral genome has been altered to remove some
viral genes required for replication and viral-protein
production and to include suitable mammalian transcription and
translation signals to drive expression of the cloned gene.
Baculoviruses have also been used to deliver target genes into
mammalian cells, although expression of the target gene is
usually transient, unless the gene is integrated into the host
cell genome to generate a stable cell line. Expression of
integrated target genes can be increased by altering the
epigenetic state of the insertion site through histone acetylation or
insertion of chromatin-relaxing DNA elements. A major
challenge for production of high levels of heterologous proteins
in mammalian cell lines is preventing cell death, which is
often induced by the stressful conditions of large-scale
bioreactors. Strategies to improve cell growth and protein yields
include genetically engineering host cells to block the
transcription factor that induces apoptosis, to prevent
accumulation of toxic metabolites in the culture medium, and to
increase expression of proteins required for proper protein
folding and secretion.


REVIEW QUESTIONS


1. What are the major posttranslational modifications of 
eukaryotic proteins in the endoplasmic reticulum and Golgi 
apparatus? 

2. In general, how do N glycosylation patterns differ among 
yeast, insects, and mammalian proteins? 

3. Describe the features of a eukaryotic expression vector. 

4. What are the advantages and disadvantages of the different 
classes of yeast vectors for producing a biotechnology 
product? 

5. Describe some of the strategies that have been used to 
increase proper folding and secretion of recombinant proteins 
from yeast cells. 


6. Discuss the salient features of a P. pastoris high-expression
integrating vector system. How has P. pastoris been
"humanized"?

7. What are baculoviruses? Describe the strategy of the
original baculovirus expression system and how this protocol has
been enhanced.

8. Describe two strategies that can be used to insert a target
gene into the baculovirus genome that do not require
transfection of insect cells.

9. Describe the main features of an extrachromosomal
mammalian-cell expression vector.

10. Describe at least two selectable marker systems that are
used with mammalian expression vectors.

11. How does a stable mammalian cell line differ from a
transient cell line? Describe one way in which a stable cell line can
be generated.

12. Why are yields of recombinant proteins produced by
mammalian cells in large bioreactors generally low? How can
yields be improved?

13. What is chromatin, and how does it affect gene
expression? Describe some of the strategies that have been
developed to increase expression levels of a target gene that is
integrated into a chromosome.

14. What criteria are used to decide if a particular
recombinant protein should be produced in a yeast, insect, or
mammalian cell system?


Directed Mutagenesis and Protein Engineering


It is possible with recombinant DNA technology to isolate the gene (or
complementary DNA [cDNA]) for any protein that exists in nature, to
express it in a specific host organism, and to produce a purified product
that can be used commercially. However, the physical and chemical
properties of these "naturally occurring" proteins are often not well suited
to an industrial application. In some instances, a protein that is better
suited to a particular task may be obtained by using a gene from an organism
that grows in an unusual, often extreme, environment. For example, when an
enzyme, such as α-amylase, was required to function at a high
temperature, the gene for that enzyme was isolated from Bacillus
stearothermophilus, an organism whose natural niche is a 90°C hot spring. In
this case, the cloned gene produced α-amylase that was stable at the high
temperatures used in the industrial production of alcohol from starch. In
addition to isolating natural genes that encode proteins with useful
properties, conventional mutagenesis and selection schemes can be used in an
attempt to create and perpetuate a mutant form of a gene that encodes a
protein with the desired properties. However, the number of mutant proteins
(each with a different amino acid change) that are possible after the
alteration of individual nucleotides within a structural gene by
conventional mutagenesis is extremely large. In practice, the
mutagenesis-selection strategy rarely results in any significant beneficial
changes to the targeted protein because most amino acid changes decrease the
activity of an enzyme.
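
To give a rough sense of the numbers involved, the following Python sketch counts the distinct amino acid replacements that a single nucleotide change can produce at each codon. It assumes only the standard genetic code; the function names and the two-codon test sequence are illustrative and are not taken from any published procedure.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_replacements(codon):
    """Amino acids reachable from `codon` by changing exactly one nucleotide.

    Synonymous changes and changes to a stop codon ("*") are excluded.
    """
    original = CODON_TABLE[codon]
    reachable = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid not in (original, "*"):
                reachable.add(amino_acid)
    return reachable


def count_single_substitution_variants(coding_sequence):
    """Count single-amino-acid variants that one nucleotide change can give."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    return sum(len(single_base_replacements(codon)) for codon in codons)


# Hypothetical two-codon sequence (Met-Ile), used only for illustration.
print(sorted(single_base_replacements("ATT")))
print(count_single_substitution_variants("ATGATT"))
```

Each codon has nine single-nucleotide neighbors, so a gene of n codons has at most 9n variants of this kind, and the number of proteins carrying two or more simultaneous changes grows combinatorially from there.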

By using a set of techniques that change specific amino acids encoded 
by a cloned gene, proteins with properties that are better suited than those 
of naturally occurring counterparts can be created for therapeutic and 
industrial applications. Such "directed evolution" of genes encoding
proteins of interest has emerged as a key technology to generate proteins with
new and improved properties. For example: 

• By altering both the Michaelis constant (K_m), which reflects the
"tightness" of substrate binding to the enzyme, and the maximal
rate of conversion of the substrate into product under defined
conditions (V_max) for an enzyme-catalyzed reaction to improve the overall
catalytic efficiency (V_max/K_m) of the reaction, where V_max equals the
total amount of enzyme present (E_0) times the catalytic rate constant
(k_cat)

• By changing the thermal tolerance, pH stability, or both of a protein, 
enabling the altered protein to be used under conditions that would 
inactivate the native version 

• By modifying the reactivity of an enzyme in nonaqueous solvents so 
that chemical reactions can be catalyzed under nonphysiological 
conditions 

• By changing an enzyme so that a cofactor is no longer required for 
certain continuous industrial production processes in which the 
cofactor must be supplied on a regular basis 

• By modifying the substrate-binding site of an enzyme to increase its
specificity, thereby decreasing the extent of undesirable side
reactions

• By increasing the resistance of a protein to cellular proteases, which 
simplifies purification and increases the recoverable yield 

• By altering the allosteric regulation of an enzyme to diminish the 
impact of metabolite feedback inhibition and increase the product 
yield 
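
The relationship among K_m, V_max, E_0, and k_cat in the first item above can be made concrete with a short Python sketch. The Michaelis-Menten rate equation used in `reaction_rate` is standard enzyme kinetics but is not stated in the text, and the kinetic constants for the "native" and "engineered" enzymes are hypothetical values chosen only to show the arithmetic.

```python
def maximal_rate(enzyme_total, k_cat):
    """V_max = E_0 * k_cat."""
    return enzyme_total * k_cat


def catalytic_efficiency(enzyme_total, k_cat, k_m):
    """Catalytic efficiency as defined in the text: V_max / K_m."""
    return maximal_rate(enzyme_total, k_cat) / k_m


def reaction_rate(substrate, enzyme_total, k_cat, k_m):
    """Michaelis-Menten initial rate: v = V_max * [S] / (K_m + [S])."""
    v_max = maximal_rate(enzyme_total, k_cat)
    return v_max * substrate / (k_m + substrate)


# Hypothetical constants (arbitrary but consistent units).
native = {"enzyme_total": 1.0, "k_cat": 100.0, "k_m": 2.0}
engineered = {"enzyme_total": 1.0, "k_cat": 150.0, "k_m": 0.5}

for name, params in (("native", native), ("engineered", engineered)):
    print(name, catalytic_efficiency(**params))
```

In this illustration the efficiency rises both because k_cat increases and because K_m decreases, which is why the text treats the two constants together rather than separately.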


Directed Mutagenesis Procedures 

It is not a simple matter to produce a new protein with specified
predetermined properties. However, it is quite feasible to modify the
existing properties of known proteins. Theoretically, these changes can be
carried out at either the protein or the gene level. However, chemical
modifications of proteins generally are harsh, nonspecific, and required
repeatedly, for each batch of protein, so it is preferable to manipulate the
DNA sequence of a cloned gene to create an altered protein with novel
properties. Unfortunately, it is not always possible to know in advance
which individual amino acids or short sequences of amino acids contribute to
a particular physical, kinetic, or chemical property. For example, a
particular property of a protein may be the consequence of two or more amino
acid residues that are far apart from each other in the linear sequence but
are juxtaposed as a result of the folding of the protein. In this case, two
or more amino acid residues may have to be changed to produce a protein with
the desired properties. In the not too distant future, computer programs may
be able to make accurate predictions of protein function on the basis of
deduced amino acid sequences, thereby simplifying the task of producing a
protein with specific predetermined properties. At present, although it is
relatively straightforward to introduce new coding information into cloned
genes, large numbers of novel proteins must often be assayed to determine
whether a particular property has been created.

The process for generating amino acid coding changes at the DNA 
level is called directed mutagenesis. Determining which amino acids of a 
protein should be changed to attain a specific property is much easier if the 
three-dimensional structure of the protein, or a similar protein, has been 
well characterized by X-ray crystallographic analysis and other analytical 
procedures. However, for many proteins, such detailed information is often 
lacking, so directed mutagenesis becomes a trial-and-error strategy in 
which changes are made to those nucleotides that are most likely to yield a
particular change in a protein property. Then, of course, the protein
encoded by each mutated gene has to be tested to ascertain whether the 
mutagenesis process has indeed generated the desired change. 

A large number of experimental approaches have been devised for the 
directed mutagenesis of cloned genes. Some of the methods are designed 
to mutate, delete, or add specific nucleotides of a cloned gene. Others
introduce mutations randomly over a short segment of the cloned gene, thereby
creating a panel of mutated proteins among which one or more may have 
the desired activity. 

Oligonucleotide-Directed Mutagenesis with M13 DNA

Oligonucleotide-directed mutagenesis (site-specific mutagenesis) is a 
straightforward method for producing defined point mutations in a 
cloned gene (Fig. 8.1). For this procedure, the investigator must know (1) 
the precise nucleotide sequence in the region of DNA that encodes the 
messenger RNA (mRNA) codon that is to be changed and (2) the amino 
acid changes that are being introduced. In an early version of this method, 
the cloned gene was inserted into the double-stranded form of an M13 
bacteriophage vector. The single-stranded form (M13 plus-strand) of the 
recombinant vector was isolated and mixed with a synthetic oligonucle¬ 
otide. The oligonucleotide had, except for 1 nucleotide, the sequence 
exactly complementary to a segment of the cloned gene. The nucleotide 
difference (i.e., mismatch) coincided precisely with the nucleotide of the 
mRNA codon that was targeted for change. In Fig. 8.1, the sequence ATT, 
which encodes the isoleucine codon, AUU, is to be changed to CTT, which 
encodes the leucine codon, CUU. The oligonucleotide hybridizes to the 
complementary region of the cloned gene if it is added in an amount much 
in excess of that of the M13 DNA, if the mismatch is near the middle of the 
oligonucleotide, and if the mixing is done at a low temperature in the pres¬ 
ence of a high concentration of salt. The 3' end of the hybridized oligo¬ 
nucleotide acts as a primer site for the initiation of DNA synthesis that 
uses the intact M13 strand as the template. The replication, which uses the 
four deoxyribonucleoside triphosphates, is catalyzed by the Klenow frag¬ 
ment of Escherichia coli DNA polymerase I. T4 DNA ligase is added to 
ensure that the last nucleotide of the synthesized strand is joined to the 5' 
end of the primer. Since in vitro DNA synthesis is often incomplete, par¬ 
tially double-stranded M13 molecules must be removed from the mixture 
by sucrose gradient centrifugation. 
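
The primer design described above can be sketched in a few lines of Python. This is only an illustration: the 24-nucleotide template, the function names, and the choice of nine matched nucleotides on each side of the mismatch are assumptions made for the example and are not taken from Fig. 8.1. The sketch builds an oligonucleotide that is complementary to the M13 plus strand everywhere except at the first nucleotide of the ATT (isoleucine) codon, which is changed so that the new strand reads CTT (leucine).

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mutagenic_oligo(plus_strand: str, position: int, new_base: str, flank: int = 9) -> str:
    """Design an oligonucleotide that anneals to the plus strand with one mismatch.

    plus_strand: cloned sequence as it appears in the M13 plus strand (5'->3').
    position:    0-based index of the nucleotide to change.
    new_base:    nucleotide wanted at that position in the coding sequence.
    flank:       matched nucleotides kept on each side of the mismatch.
    """
    if position - flank < 0 or position + flank >= len(plus_strand):
        raise ValueError("the mismatch must sit in the middle of the oligonucleotide")
    window = plus_strand[position - flank : position + flank + 1]
    mutated = window[:flank] + new_base + window[flank + 1 :]
    # The oligonucleotide pairs with the plus strand, so it is the reverse
    # complement of the mutated coding sequence.
    return reverse_complement(mutated)


# Hypothetical stretch of a cloned gene that contains the ATT (Ile) codon.
template = "GCTGGTACCATTGGAGCTTCAGGT"
ile_codon_start = template.index("ATT")
oligo = mutagenic_oligo(template, ile_codon_start, "C", flank=9)
print("mutagenic oligonucleotide (5'->3'):", oligo)
```

Keeping the mismatch in the center of the oligonucleotide, as the function enforces, mirrors the hybridization requirement stated in the text.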

Each complete double-stranded M13 molecule, now containing the
mismatched nucleotide, is introduced into E. coli cells by transformation.
The infected cells produce M13 virus particles, which eventually lyse the
cells and form plaques. Theoretically, because plasmid DNA is replicated
semiconservatively, half of the phage that are formed carry the wild-type
sequence and the other half contain the mutated sequence that has the
specified nucleotide change. Phage produced in the initial transformation
step are propagated in E. coli, and particles that contain only the mutated
gene are identified by DNA hybridization under highly stringent conditions.
The original oligonucleotide containing the mismatched nucleotide
is the probe in these hybridization experiments and will bind only to the
mutated gene under these conditions. After the double-stranded form of
M13 is isolated, the mutated gene is excised by digestion with restriction
enzymes and then spliced onto an E. coli plasmid expression vector. For
further study, the altered protein is expressed in and purified from the
E. coli cells.

FIGURE 8.1 Oligonucleotide-directed mutagenesis. Single-stranded bacteriophage
M13 (M13 + strand), carrying a cloned gene, is annealed with a complementary
synthetic oligonucleotide containing one mismatched base, i.e., one base that is not
complementary to its counterpart in the target DNA. With the oligonucleotide as
the primer, DNA synthesis is catalyzed by the Klenow fragment of E. coli DNA
polymerase I; the cloned gene and the M13 vector are the templates. Synthesis
continues until the entire strand is copied. The newly synthesized DNA strand is
circularized by T4 DNA ligase. The ligation reaction mixture is used to transform
E. coli. Both the target DNA with its original sequence and the mutated sequence
are present in the progeny M13 phage. dNTPs, deoxynucleoside triphosphates.

In actuality, when oligonucleotide-directed mutagenesis is used, the
expected 50% of the M13 viruses carrying the mutated form of the target
gene are not recovered. Rather, for a variety of technical reasons, only
around 1% of the plaques actually contain phage carrying the mutated
gene. Consequently, the oligonucleotide-directed mutagenesis method
has been modified in several ways to enrich for mutant phage plaques.
For example, one approach is to introduce the M13 viral vector carrying
the gene that is to be mutagenized into an E. coli strain that has two
defective enzymes of DNA metabolism (Fig. 8.2). One enzyme is a defective
form of dUTPase (dut). Cells without a functional dUTPase have an
elevated intracellular level of dUTP, which in turn causes a few dUTP
residues to be incorporated into DNA during replication instead of dTTP.
The other enzyme is a defective uracil N-glycosylase (ung). In the absence
of functional uracil N-glycosylase, the dUTP residues that were spuriously
incorporated into DNA cannot be removed. The single-stranded M13 DNA
that is produced by this E. coli strain has approximately 1% of its
thymidines replaced by uridines. A mismatched oligonucleotide primer is
hybridized to this substitution-containing M13 DNA, and the second strand
of M13 is prepared by in vitro synthesis and ligation. The double-stranded
bacteriophage DNA is next introduced into an E. coli strain that contains
a functional ung gene. The active uracil N-glycosylase in the host cells
then removes uridine residues from the transforming M13 DNA (Fig. 8.2).
As a result, the original M13 template strand is degraded and only the
mutated strand, which contains no dUTP, is replicated. In this way, the
yield of M13 bacteriophage carrying a gene with a site-specific mutation
is significantly increased.

FIGURE 8.2 Enrichment of mutated M13 by passage of the parental DNA through a
dut ung strain of E. coli. The target DNA is cloned into the double-stranded
replicative form of bacteriophage M13, which is then used to transform a dut ung
strain of E. coli. The dut mutation causes the intracellular level of dUTP to be
elevated; the high level of nucleotide leads to the incorporation of a few dUTP
residues (U). The ung mutation prevents the removal of any incorporated uracil
residues. Following in vitro oligonucleotide-directed mutagenesis, the
double-stranded M13 vector with the mutated DNA is introduced into wild-type
E. coli. The wild-type ung gene product (uracil N-glycosylase) removes any uracil
residues from the parental strand, so a significant portion of the parental strand
is degraded. The mutated strand remains intact because it does not contain uracil.
It serves as a template for DNA replication, thereby enriching the yield of M13
bacteriophage carrying the mutated gene.
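
A rough calculation suggests why passage through a dut ung strain marks the parental strand so reliably. The sketch below is an illustration under stated assumptions, not a measured result: the strand length (7,000 nucleotides) and the thymidine fraction (25%) are hypothetical values, only the 1% substitution level comes from the text, and substitutions are treated as independent events.

```python
def expected_uracils(strand_length: int, t_fraction: float, u_substitution: float) -> float:
    """Expected number of uracils in the parental single strand."""
    return strand_length * t_fraction * u_substitution


def prob_uracil_free(strand_length: int, t_fraction: float, u_substitution: float) -> float:
    """Probability that a parental strand carries no uracil at all,
    assuming each thymidine position is substituted independently."""
    n_thymidines = round(strand_length * t_fraction)
    return (1.0 - u_substitution) ** n_thymidines


# Assumed values: 7,000-nucleotide recombinant M13 strand, 25% thymidine.
# The 1% substitution level is the figure quoted in the text.
length, t_frac, u_sub = 7000, 0.25, 0.01
mean_u = expected_uracils(length, t_frac, u_sub)
p_clean = prob_uracil_free(length, t_frac, u_sub)
print(f"expected uracils per parental strand: {mean_u:.1f}")
print(f"probability that a parental strand has no uracil: {p_clean:.1e}")
```

Under these assumptions nearly every parental strand carries several uracils and so becomes a substrate for uracil N-glycosylase in the ung+ host, whereas the strand synthesized in vitro from dTTP carries none.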

Oligonucleotide-Directed Mutagenesis with Plasmid DNA 

The major drawback to bacteriophage M13 oligonucleotide-directed mutagenesis
is the large number of time-consuming steps that need to be performed
before the mutated form of the target gene is eventually isolated.
As an alternative to the M13 system, a number of protocols that allow oli¬ 
gonucleotide-directed mutagenesis to be performed with plasmid rather 
than M13 DNA have been developed. With this approach, the need to sub¬ 
clone a target gene from a plasmid into M13 and then, after mutagenesis, 
clone it back into a plasmid is avoided. In one of the plasmid-based muta¬ 
genesis protocols, the target DNA is inserted into a multiple cloning site on 
a plasmid vector that contains a functional tetracycline resistance gene and 
a nonfunctional ampicillin resistance (Amp^r) gene as the result of a single
nucleotide substitution in the middle of the Amp^r gene (Fig. 8.3). The vector
carrying the target DNA is transformed into E. coli host cells to increase the
amount of DNA through plasmid replication. Following growth of the 
transformed cells, the double-stranded plasmid DNA is extracted and then 
denatured by treatment with an alkaline solution to form single-stranded 
circular DNA molecules. Three different oligonucleotides that anneal to 
one of the single-stranded circular DNA molecules are added to the sample 
of denatured plasmid DNA. One oligonucleotide is designed specifically to 
alter the target DNA, another is designed to correct the substituted nucle¬ 
otide in the nonfunctional ampicillin resistance gene, and the third is 
designed to change a single nucleotide in the tetracycline resistance gene so 
that the gene will become nonfunctional. The four deoxyribonucleoside 
triphosphates and T4 DNA polymerase, which has the same activity as the 
Klenow fragment of E. coli DNA polymerase I, are added to the reaction
mixture, and the 3' ends of the annealed oligonucleotides act as primers for 
DNA synthesis with the intact circular DNA molecule as the template. The 
nicks in the synthesized strand are sealed by T4 DNA ligase. After synthesis
and ligation are complete, the reaction mixture is used to transform
E. coli cells. Transformants are selected for ampicillin resistance and
tetracycline sensitivity. With this procedure, about 90% of the selected
transformants have the specified mutation in the target gene. In the remaining
transformants, the target gene is unchanged because the oligonucleotide
did not anneal to the target gene or it was bypassed during DNA synthesis.
The cells with the specified mutation in the target gene are identified by
DNA hybridization. All of the plasmids, host bacterial strains, enzymes,
oligonucleotides (other than the one needed to alter the target gene), and
buffers for this method are sold as a kit, facilitating its widespread use.

FIGURE 8.3 Oligonucleotide-directed mutagenesis with plasmid DNA. The target
DNA is inserted into the multiple cloning site (MCS) on the vector pALTER.
Plasmid DNA isolated from E. coli cells is alkaline denatured before the mutagenic
oligonucleotide, the ampicillin resistance (Amp^r) oligonucleotide, and the
tetracycline sensitivity (Tet^s) oligonucleotide are annealed. The oligonucleotides
act as primers for DNA synthesis by T4 DNA polymerase with the original strand as
the template. The gaps between the synthesized pieces of DNA are sealed by T4 DNA
ligase. The reaction mixture is used to transform E. coli host cells, and cells that are
Amp^r and Tet^s are selected.

PCR-Amplified Oligonucleotide-Directed Mutagenesis 

Researchers continually try to develop simpler and faster protocols. For 
site-specific mutagenesis, the polymerase chain reaction (PCR) can be 
exploited both to introduce the desired mutation and to enrich for the 
mutated gene. In fact, kits are often available to simplify the process; a 
researcher merely adds the target plasmid carrying the gene of interest and 
forward and reverse PCR primers that are typically 24 to 30 nucleotides in 
length, and following PCR, a high percentage of the plasmids produced 
will have the desired mutation. In this case, no special plasmid vectors are 
required; any plasmid up to approximately 10 kb in length is acceptable. 
To create point mutations by PCR-based mutagenesis, nucleotide changes are
introduced in the middle of the primer sequence (Fig. 8.4). To create deletion
mutations, primers must border the region of target DNA to be deleted on 
both sides and be perfectly matched to their annealing (or template) 
sequences. To create mutations with long insertions, a stretch of mis¬ 
matched nucleotides is added to the 5' end of one or both primers, while 
for mutations with short insertions, a stretch of nucleotides is designed in 
the middle of one of the primers. In all of these procedures, the only abso¬ 
lute requirements are that (1) the nucleotide sequence of the target DNA 
must be known and (2) the 5' ends of the primers must be phosphorylated. 
Following PCR amplification, the linear DNA is circularized by ligation
with T4 DNA ligase. The circularized plasmid DNA is then used to transform
E. coli by any standard procedure. Since this protocol yields a very
high frequency of plasmids with the desired mutation, it is not necessary to
utilize any enrichment procedures. Rather, screening three or four clones
by sequencing the target DNA should be sufficient to find the desired
mutation. In sum, this procedure introduces a specified mutation (point,
insertion, or deletion) into a cloned gene without the need to insert the
cloned gene into bacteriophage M13; to use enrichment procedures, such as
the dut ung system; or to subclone the mutated gene from M13 onto an
expression plasmid vector. Given its simplicity and effectiveness, this
procedure has come to be widely used.

FIGURE 8.4 Overview of the basic methodology to introduce point mutations,
insertions, or deletions into DNA cloned into a plasmid. The forward and reverse
primers are shown in red and green, respectively. The solid circles represent
template DNA. The dotted lines represent newly synthesized DNA. The X indicates an
altered nucleotide(s).
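
The deletion case can be modeled as a simple string operation on a circular plasmid. The sketch below is an illustration only: the 200-nucleotide plasmid is randomly generated, and the function name, coordinates, and 24-nucleotide primer length are assumptions chosen for the example (the primer length falls within the 24 to 30 nucleotides mentioned in the text). The two primers are perfectly matched to the sequences that border the region to be deleted and point away from it, so the linear PCR product contains everything except the deleted segment.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def delete_by_pcr(plasmid: str, start: int, end: int, primer_len: int = 24):
    """Model a PCR deletion of plasmid[start:end] on a circular plasmid.

    Returns (forward primer, reverse primer, linear PCR product), all 5'->3'.
    """
    n = len(plasmid)
    if not 0 <= start < end <= n or n - (end - start) < 2 * primer_len:
        raise ValueError("deletion coordinates leave no room for the primers")
    doubled = plasmid + plasmid  # lets slices run past the origin of the circle
    # Forward primer: top-strand sequence immediately downstream of the deletion.
    forward = doubled[end : end + primer_len]
    # Reverse primer: reverse complement of the sequence immediately upstream.
    reverse = reverse_complement(doubled[n + start - primer_len : n + start])
    # The product runs from the right edge of the deletion, around the circle,
    # to its left edge; ligation of the phosphorylated ends closes the circle.
    product = doubled[end : n + start]
    return forward, reverse, product


rng = random.Random(0)
plasmid = "".join(rng.choice("ACGT") for _ in range(200))  # hypothetical plasmid
del_start, del_end = 60, 90
forward, reverse, product = delete_by_pcr(plasmid, del_start, del_end)
print(len(plasmid), "->", len(product), "nucleotides after deletion")
```

A point mutation or a short insertion would use the same back-to-back primer layout, with the altered nucleotides placed in the middle of one primer rather than omitted between the two.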


Error-Prone PCR 


FIGURE 8.5 Error-prone PCR of a target gene yields a variety of mutated forms
of the gene. Mutations are shown in blue. The horizontal arrows represent
PCR primers.

Some of the temperature-stable DNA polymerases that are used to amplify
target DNA by PCR occasionally insert incorrect nucleotides into the rep¬ 
licating DNA. If one is attempting to amplify a DNA with high fidelity, 
this is obviously a problem. On the other hand, if the construction of a 
library of mutants of the target gene is the objective, then this approach is 
a very powerful method for random mutagenesis. Moreover, with DNA 
up to 10 kilobase pairs (kb) in size, it is possible to vary the number of 
alterations per gene from about 1 to about 20 by modifying the DNA tem¬ 
plate concentration (Fig. 8.5). When error-prone PCR is performed using 
Taq DNA polymerase, which lacks proofreading activity, the error rate 
may be increased by adding Mn2+, by increasing the concentration of Mg2+,
and by adding unequal amounts of the four deoxynucleoside triphosphates
to the reaction buffer. Alternatively, high error rates may be
achieved with other temperature-stable DNA polymerases in the absence
of Mn2+ and with balanced amounts of the four deoxynucleoside triphosphates.
Following error-prone PCR, the randomly mutagenized DNA is
cloned into expression vectors and screened for altered or improved protein
activity. The DNA from those clones that encode the desired activity
is isolated and sequenced so that the relevant changes to the target DNA
may be identified. Error-prone PCR has been used to create enzymes with
improved solvent and temperature stability and with enhanced specific
activity.
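
The spread of mutations in an error-prone PCR library can be illustrated with a small simulation. This is a deliberately simplified sketch: the per-nucleotide error probability (0.005, cumulative over the whole amplification), the 1,000-nucleotide gene, and the library size are hypothetical values, and every substitution is treated as equally likely, which real polymerases do not guarantee.

```python
import random

BASES = "ACGT"


def error_prone_copy(gene: str, error_rate: float, rng: random.Random) -> str:
    """Copy a gene, replacing each nucleotide with a different one
    with probability error_rate."""
    copied = []
    for base in gene:
        if rng.random() < error_rate:
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def mutation_count(original: str, mutant: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(original, mutant))


rng = random.Random(1)
gene = "".join(rng.choice(BASES) for _ in range(1000))  # hypothetical target gene
library = [error_prone_copy(gene, 0.005, rng) for _ in range(200)]
counts = [mutation_count(gene, mutant) for mutant in library]
mean_mutations = sum(counts) / len(counts)
print(f"mean alterations per gene: {mean_mutations:.2f}")
print(f"range: {min(counts)} to {max(counts)}")
```

Raising or lowering the assumed error probability shifts the whole distribution, which is the effect that the reaction conditions described above are used to achieve.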

Random Mutagenesis with Degenerate Oligonucleotide Primers 

Unfortunately, investigators seldom know which specific nucleotide 
changes need to be introduced into a cloned gene to modify the properties 
of the target protein. Consequently, they must use methods that generate 
all the possible amino acid changes at one particular site. For example, 
oligonucleotide primers can be synthesized with any of the four nucle¬ 
otides at defined positions. This pattern of sequence degeneracy is gener¬ 
ally achieved by programming an automated DNA synthesis reaction to 
add a low level (usually a few percent) of each of the three alternative 
nucleotides each time a particular nucleotide is added to the chain (Fig. 
8.6). In this way, the oligonucleotide primer preparation contains a hetero¬ 
geneous set of DNA sequences that will generate a series of mutations that 
are clustered in a defined portion of the target gene. 

FIGURE 8.6 Chemical synthesis of oligonucleotide primers with any of the four
nucleotides at defined positions. In this case, the flask with G phosphoramidite
consists of a mixture of nucleotides, such as 94% G, 2% A, 2% C, and 2% T, leading
to a mixture of oligonucleotides that may have A, C, or T at the sites where G is the
specified nucleotide.

This approach has two advantages. (1) Detailed information regarding
the roles of particular amino acid residues in the functioning of the protein
is not required. (2) Unexpected mutants encoding proteins with a range of
interesting and useful properties may be generated because the introduced
changes are not limited to one amino acid. Of course, should none of the
mutants yield a protein with the properties that are being sought, then it
may be necessary to repeat the entire procedure with a set of degenerate
primers that is complementary to a different region of the gene.
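
The composition of a doped oligonucleotide preparation follows directly from the mixing ratio. The sketch below assumes that every doped position is synthesized from a 94%/2%/2%/2% mixture like the one given for G in Fig. 8.6 and that positions are independent; the choice of 20 doped positions is hypothetical.

```python
from math import comb


def substitution_distribution(n_positions: int, p_specified: float = 0.94) -> list[float]:
    """Probability of 0, 1, ..., n substitutions in an oligonucleotide
    with n doped positions (binomial model)."""
    q = 1.0 - p_specified
    return [
        comb(n_positions, k) * q**k * p_specified ** (n_positions - k)
        for k in range(n_positions + 1)
    ]


n_doped = 20  # hypothetical number of doped positions in the primer
dist = substitution_distribution(n_doped)
mean_substitutions = sum(k * p for k, p in enumerate(dist))
print(f"fraction of primers with no change: {dist[0]:.3f}")
print(f"fraction with exactly one change:   {dist[1]:.3f}")
print(f"mean substitutions per primer:      {mean_substitutions:.2f}")
```

Under this model the level of doping trades off the fraction of unchanged primers against the fraction that carries several changes at once, which is why only a few percent of each alternative nucleotide is normally added.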

In practice, partially degenerate oligonucleotides may be incorporated 
into a target gene by a variety of procedures. One strategy entails inserting 
a target gene into a plasmid between two unique restriction endonuclease 
sites and using PCR, in separate reactions, to amplify overlapping frag¬ 
ments (Fig. 8.7). The primer pair that is used to amplify the left fragment 
consists of mismatched oligonucleotides that were synthesized to contain
degenerate positions and that bind to the lower strand of the target
DNA, along with a regular, completely complementary primer that hybridizes
to a region of the upper strand that flanks the left unique restriction
endonuclease site. For the right fragment, the PCR primers are the mismatched
oligonucleotides that were synthesized to contain degenerate
positions and that bind to the upper strand of the target DNA, along
with a primer that is complementary to a region of the lower strand that lies 
outside the second (right) unique restriction endonuclease site. After PCR 
amplification, the products are purified and combined. Denaturation and 
reannealing of the DNA in the mixture produce some DNA molecules that 
overlap in the target region. DNA polymerase is then used to form complete 
double-stranded DNA molecules. These molecules are amplified by PCR 
with a pair of primers that bind to opposite ends of the DNA molecule. The
amplified DNA is then treated with the two restriction enzymes for which
there are unique sites at the ends of the fragment, and the DNA is then
cloned into a suitable plasmid vector. This procedure results in the production
of an altered gene that has mutated sites in the region of the overlap of
the original oligonucleotides.

FIGURE 8.7 Random mutagenesis of a target DNA by using degenerate
oligonucleotides and PCR. The left and right portions of the target DNA are
amplified separately by PCR. The primer pairs are shown by horizontal arrows.
A mutation-producing oligonucleotide is shown as a line with three spikes; each
spike denotes a position that contains a nucleotide that is not found in the native
gene. The amplified fragments are purified, denatured to make them single
stranded, and then reannealed. Complementary regions of overlap are formed
between complementary mutation-producing oligonucleotides. The single-stranded
regions are made double stranded with DNA polymerase, and then the entire
fragment is amplified by PCR. The resultant product is digested with restriction
endonucleases A and B and then cloned into a vector that has been digested with
the same enzymes.

MILESTONE

Oligonucleotide-Directed Mutagenesis Using M13-Derived Vectors: an Efficient
and General Procedure for the Production of Point Mutations in Any Fragment
of DNA

M. J. Zoller and M. Smith
Nucleic Acids Res. 10:6487-6500, 1982

The technique of oligonucleotide-directed mutagenesis (site-specific
mutagenesis) was developed mainly in the laboratory of Michael Smith as
incremental changes to the technique of "marker rescue." In the marker
rescue procedure, a mutation in a bacteriophage genomic DNA is corrected
after the mutant DNA is annealed to a fragment of complementary wild-type
DNA. Subsequently, it was demonstrated that a chemically synthesized
oligonucleotide annealed to bacteriophage genomic DNA could produce a
specific mutation. Unfortunately, these and other early procedures for
oligonucleotide-directed mutagenesis required specialized skills and initially
could be performed in only a few research laboratories. However, the
procedure using bacteriophage M13 described by Zoller and Smith made it
a relatively straightforward matter for thousands of laboratories throughout
the world to specifically and rapidly alter the DNA sequence of any cloned
gene. The key to the success of the protocol developed by Zoller and
Smith lay in the use of E. coli bacteriophage M13. It was possible to clone
foreign DNA into the double-stranded form of the virus, add an
oligonucleotide with a specified change to the single-stranded form to produce a
mutated DNA copy, and then recover the mutated double-stranded form in
a relatively high yield. Since it was originally described, this procedure
has been enhanced, simplified, and optimized and has been used by a
large number of researchers to specifically modify thousands of different
genes.


Random Insertion/Deletion Mutagenesis 

The technique of error-prone PCR, which is quite commonly used to intro¬ 
duce random changes into a target gene, is somewhat limited in the types 
of changes that can be introduced. That is, since errors are typically intro¬ 
duced into DNA at no more than one or two per 1,000 nucleotides, only 
single nucleotides are replaced within a triplet codon, yielding only a lim¬ 
ited number of amino acid changes. As an alternative to error-prone PCR, 
researchers have developed the technique of random insertion/deletion 
mutagenesis. With this approach, it is possible to delete a small number of 
nucleotides at random positions along the gene and, at the same time, 
insert either specific or random sequences into that position. This method 
entails the following steps (Fig. 8.8). 

1. An isolated gene fragment with different restriction endonuclease 
sites at each end is ligated at one end to a short nonphosphorylated 
linker that leaves a small gap in the DNA. The gap is a consequence 
of the fact that the 5' nucleotide from the linker is not phosphory- 
lated and therefore cannot be ligated to an adjacent 3'-OH group. 

2. After restriction enzyme digestion that creates compatible sticky 
ends, the gene fragment is cyclized with T4 DNA ligase to create a 
circular double-stranded gene fragment with a nick in the anti- 
sense strand. 

3. The nicked strand is degraded by digestion with the enzyme T4 
DNA polymerase (which has exonuclease activity). 

4. The single-stranded DNA is randomly cleaved at single positions 
by treating it with a cerium(IV)-ethylenediaminetetraacetic acid 
(EDTA) complex. 

5. The linear single-stranded DNAs are ligated to a linker (containing
several additional nucleotides selected for insertion at one end),
and the entire mutagenesis library is PCR amplified.

6. The linkers are removed by restriction enzyme digestion.

7. The constructs are made blunt ended by filling in the single-stranded
overhangs using the Klenow fragment of E. coli DNA
polymerase I and then cyclized again by T4 DNA ligase.

8. The amplified products are digested with appropriate restriction
enzymes, cloned into a plasmid vector, and then tested for activity.

FIGURE 8.8 Schematic representation of random insertion/deletion protocol to
introduce random mutations into a gene of interest. The inserted DNA is shown in
yellow and the linkers in green. Adapted from Murakami et al., Nat. Biotechnol.
20:76-81, 2002.

With this approach, it is possible to insert any small DNA fragment (car¬ 
ried on a linker) into the randomly cleaved single-stranded DNA, with 
the result that a much greater number of modified genes may be gener¬ 
ated than by error-prone PCR. The mutations that are developed by this 
procedure may be used to select protein variants with a wide range of 
activities. 
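
The limitation of error-prone PCR noted at the start of this section, namely that a single nucleotide replacement within a codon yields only a limited number of amino acid changes, can be checked against the standard genetic code. The sketch below is illustrative; the function name is an assumption, and the codon table is built from the standard code written in T, C, A, G order.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_substitution_neighbors(codon: str) -> set[str]:
    """Amino acids (one-letter codes) reachable from a codon by changing
    exactly one nucleotide, excluding the original amino acid and stops."""
    original = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1 :]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid != original and amino_acid != "*":
                reachable.add(amino_acid)
    return reachable


ile_neighbors = single_substitution_neighbors("ATT")
print("from ATT (Ile):", sorted(ile_neighbors))
print("alternatives reachable:", len(ile_neighbors), "of 19")
```

Because a codon has only nine single-nucleotide neighbors, no codon can reach more than nine alternative amino acids this way, whereas inserting or replacing several nucleotides at once removes that ceiling.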

DNA Shuffling 

Some biologically important proteins, such as α-interferon (IFN-α) (see
chapter 10), are encoded by a family of several related genes, with each 
protein having slightly different biological activity. If all, or at least several, 
of the genes or cDNAs for a particular protein have been isolated, it is pos¬ 
sible to recombine portions of these genes or cDNAs to produce hybrid or 
chimeric forms (Fig. 8.9). This "DNA shuffling" is done in the hope that 
some of the hybrid proteins will have unique properties or activities that 
were not encoded in any of the original sequences. Also, some of the hybrid 
proteins may combine important attributes of two or more of the original 
proteins, e.g., high activity and thermostability. 

The simplest way to shuffle portions of similar genes is through the 
use of common restriction enzyme sites (Fig. 8.10). Digestion of two or 
more of the DNAs that encode the native forms of similar proteins with 
one or more restriction enzymes that cut the DNAs in the same place, fol¬ 
lowed by ligation of the mixture of DNA fragments, can potentially gen¬ 
erate a large number of hybrids. For example, two DNAs, each of which 
has three unique restriction enzyme sites, can be recombined (shuffled) to 
produce 14 different hybrids in addition to the original DNA (Fig. 8.10). 
Another way to shuffle DNA involves combining several members of a 
gene family, fragmenting the mixed DNA with deoxyribonuclease I 
(DNase I), selecting smaller DNA fragments, and PCR amplifying these 
fragments. During PCR, gene fragments from different members of a gene 
family cross-prime each other after DNA fragments bind to one another in 
regions of high homology/complementarity. The final full-length prod¬ 
ucts are obtained by including "terminal primers" in the PCR. After 20 to 
30 PCR cycles, a panel of hybrid (full-length) DNAs will be established 
(Fig. 8.11). The hybrid DNAs are then used to create a library that can be 
screened for the desired activity (a task that may be the most difficult and 
labor-intensive part of the entire process). Although DNA shuffling works 
well with gene families—it is sometimes called molecular breeding—or 
with genes from different families that nevertheless have a high degree of 
homology, the technique is not especially useful when proteins have little 
or no homology. Thus, the DNAs must be very similar to one another or 
the PCR will not proceed. To remedy this situation and combine the genes 
of dissimilar proteins, several variations of the DNA-shuffling protocol 
have been described. 
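
The count of 14 hybrids quoted above for restriction enzyme-based shuffling can be reproduced by enumeration. The sketch assumes, for illustration, that the three shared sites cut each gene into four segments and that segments religate only in their original order and orientation; the parent labels "A" and "B" are placeholders.

```python
from itertools import product


def shuffled_hybrids(parents: str, n_shared_sites: int) -> list[str]:
    """List every hybrid gene as a string of parent labels, one per segment.

    n_shared_sites restriction sites divide each gene into
    n_shared_sites + 1 segments; combinations that take every segment
    from the same parent are the original genes and are excluded.
    """
    n_segments = n_shared_sites + 1
    return [
        "".join(combo)
        for combo in product(parents, repeat=n_segments)
        if len(set(combo)) > 1
    ]


hybrids = shuffled_hybrids("AB", 3)
print(len(hybrids), "hybrids:", ", ".join(hybrids))
```

With two parents and four segments there are 2^4 = 16 ordered combinations, and removing the two parental arrangements leaves the 14 hybrids shown in Fig. 8.10.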



FIGURE 8.9 Schematic representation of the changes introduced into a protein
by either random mutagenesis or error-prone PCR, both of which cause
single-amino-acid substitutions, and by DNA shuffling, in which genes are formed
with large regions from different sources.

FIGURE 8.10 The 14 different hybrid genes that can be generated by combining
restriction enzyme fragments from two genes from the same gene family that have
three different restriction sites in common. RE, restriction enzyme.


One procedure that was developed to combine the genes of dissimilar 
proteins and that does not rely on PCR amplification of DNA fragments is 
called nonhomologous random recombination. In this procedure (Fig. 
8.12), DNAs from different sources (either defined or random DNA 
sequences or a mixture of both) are combined and then partially digested 
with DNase I. These DNA fragments, which include a wide variety of sizes, 
are made blunt ended by digestion with the enzyme T4 DNA polymerase. 
This enzyme both fills in 5' overhanging nucleotides and degrades 3' over¬ 
hanging nucleotides. The DNA fragments are then mixed with a synthetic 
DNA fragment that forms a hairpin loop and contains a specific restriction 
enzyme site before the entire mixture is ligated by the addition of the 
enzyme T4 DNA ligase to form extended mosaic DNA hairpin molecules of 
variable lengths. The average length of these hairpin structures is dictated 
by the ratio between the blunt-ended DNA and the added DNA hairpins, 
which prevent further concatemerization once they are ligated to the ends. 
Finally, restriction enzyme digestion removes the hairpin loops so that the 
resulting sticky-ended DNA fragments can be inserted into plasmid vec¬ 
tors and tested for various activities. Because this process randomly recom¬ 
bines DNA fragments, only a very small fraction of the recombined DNAs 
are likely to encode the desired activity. 

Mutant Proteins with Unusual Amino Acids 

Essentially any protein can be altered by substituting one amino acid for 
another using directed mutagenesis. However, this approach is limited to 
the 20 amino acids that are normally used in protein synthesis. One way to 
increase the diversity of the proteins formed after mutagenesis is to intro¬ 
duce synthetic amino acids with unique side chains at specific sites. To do 
this, E. coli was engineered to produce both a novel transfer RNA (tRNA)
that is not recognized by any of the existing E. coli aminoacyl-tRNA
synthetases but nevertheless functions in translation and a new aminoacyl-tRNA
synthetase that aminoacylates only that novel tRNA. A novel tRNA
and unique aminoacyl-tRNA synthetase pair from the archaebacterium 
Methanococcus jannaschii was used as a starting point for this system. The 
tyrosine-tRNA synthetase from M. jannaschii can add an amino acid to an 
amber suppressor tRNA that is a mutant of its tyrosine-tRNA. An amber 
suppressor tRNA is a modified tRNA that can insert an amino acid into a 
protein in places where the mRNA contains an amber codon, UAG, which 
normally codes for a stop, i.e., the cessation of protein synthesis. Suppression
is always less than 100% and often depends on the nucleotides surrounding
the stop codon, which limits the translational fusion of proteins whose
mRNAs normally end with a UAG stop codon to downstream proteins.
The amino acid specificity of the tyrosine-tRNA synthetase from M.
jannaschii is altered by random mutagenesis of its gene so that, instead of 
tyrosine, it places O-methyl-L-tyrosine onto the tRNA. Specifically, a cloned 
version of the target gene is modified by oligonucleotide-directed muta¬ 
genesis so that it contains a 5'-TAG-3' in that portion of the DNA that 
encodes the amino acid that is targeted for change to O-methyl-L-tyrosine. 
Once the modified DNA has been selected, it is used to transform an E. coli
strain that was previously engineered to produce the
O-methyl-L-tyrosine-tRNA. The engineered E. coli strain inserts
O-methyl-L-tyrosine into proteins that contain a UAG stop codon, resulting in a
full-length target protein containing the modified amino acid. Had the mutant
gene been expressed in wild-type E. coli, a truncated version of the protein would
have been produced (Fig. 8.13). This system may be manipulated to insert
a variety of different amino acid analogues into specified sites within proteins
in the hope of producing functional proteins with altered activities
compared with the native form. In a similar approach to this problem,
researchers modified a portion of the valine-tRNA synthetase gene so that
the altered enzyme was able to insert the nonstandard amino acid
aminobutyrate into proteins. While the full potential of these approaches has yet
to be realized, it is nevertheless clear that it is now possible to produce
proteins containing unusual chemical structures and possibly having
unique properties.


Protein Engineering 

About 20 of the many thousands of enzymes that have been studied and 
characterized biochemically account for over 90% of the total amount of 
enzymes that is currently being used industrially. Table 8.1 lists some of the 
most important enzymes and their primary uses. A major reason why addi¬ 
tional enzymes are not used to any great extent in industrial processes is 
that an activity that has evolved to perform a particular function for a 
microorganism, animal, or plant under natural conditions usually is not 
well suited for a highly specialized industrial application. Most enzymes 
are easily denatured by exposure to the conditions, such as high tempera¬ 
ture and the presence of organic solvents, that are used in many industrial 
processes. Although thermotolerant enzymes can be isolated from thermo¬ 
philic microorganisms, these organisms often lack the particular enzyme 
that is required for use in industrial processes. However, with the
availability of directed mutagenesis and gene cloning, these constraints are no
longer significant.

FIGURE 8.11 Some of the hybrid DNAs that can be generated during PCR
amplification of three members of a gene family.


Adding Disulfide Bonds 

The thermostability of a protein can be increased by creating a molecule
that will not readily unfold at elevated temperatures. In addition, these
thermostable enzymes are often resistant to denaturation by organic
solvents and nonphysiological conditions, such as extremes of pH. The
addition of disulfide bonds (through the introduction of specifically placed
cysteine residues) can usually significantly increase the stability of a
protein (Fig. 8.14). The question is whether the extra disulfide bonds perturb
the normal functioning of the protein.

T4 lysozyme. In one study, six variants of the enzyme T4 lysozyme were
constructed by oligonucleotide-directed mutagenesis, which introduced
new internal disulfide bonds. Specifically, two, four, or six amino acid
residues at a time were changed to cysteine, thereby generating one, two, or
three disulfide bonds, respectively (Table 8.2). The side chains of the amino
acid residues that were targeted to become cysteine residues were known
to be spatially close to each other in the active enzyme. This proximity
ensured that the overall conformation of the molecule would remain
essentially unaffected by the formation of the new disulfide linkages. The
targeted amino acids were not involved in the active site of the enzyme, which
is the portion of the enzyme molecule that is probably most sensitive to
small changes in conformation. The newly introduced cysteines created
disulfide bonds between positions 3 and 97, 9 and 164, and 21 and 142 of
the enzyme, where the numbers denote the amino acid positions in the
polypeptide, starting from the N terminus.

FIGURE 8.12 Schematic representation of nonhomologous random recombination.
Different DNAs (shown in different colors) are mixed together, partially digested
with DNase I, blunt ended by digestion with T4 DNA polymerase, size fractionated,
ligated with synthetic hairpin DNAs to form extended hairpins, restriction enzyme
digested to remove the hairpin ends and generate sticky ends, and then ligated into
plasmid vectors.

After mutagenesis, each mutated gene was identified and expressed in 
E. coli. The engineered enzymes were purified, and the enzymatic activity 
FIGURE 8.13 Schematic representation of the production of a protein with a modified
(nonstandard amino acid) side chain. The start codon is highlighted in green, and
the stop codons are in red. The inserted amino acid analogue is shown in blue.


TABLE 8.1 Some industrial enzymes and their commercial uses

Enzyme               Industrial use(s)
α-Amylase            Beer making, alcohol production
Aminoacylase         Preparation of L-amino acids
Bromelain            Meat tenderizer, juice clarification
Catalase             Antioxidant in prepared foods
Cellulase            Alcohol and glucose production
Ficin                Meat tenderizer, juice clarification
Glucoamylase         Beer making, alcohol production
Glucose isomerase    Manufacture of high-fructose syrups
Glucose oxidase      Antioxidant in prepared foods
Invertase            Sucrose inversion
Lactase              Whey utilization, lactose hydrolysis
Lipase               Cheese making, preparation of flavorings
Papain               Meat tenderizer, juice clarification
Pectinase            Clarifying fruit juices, alcohol production
Protease             Detergent, alcohol production
Rennet               Cheese making
FIGURE 8.14 Schematic representation of the engineering of a protein that
contains two engineered disulfide bridges (colored lines in the bottom diagram)
that hold together and may stabilize regions of the protein that are often
separated in the primary amino acid sequence.


and thermostability of each were determined (Table 8.2). The thermostability
of a protein is often defined as the temperature at which the overall structure
of the protein is 50% denatured; the state of denaturation can be assessed by
monitoring the circular dichroism of the protein in solution. The wild-type
(native) form of the enzyme T4 lysozyme has two cysteine residues, both of
which exist as free sulfhydryl groups rather than as part of a disulfide
bond. In the so-called pseudo-wild-type enzyme, these cysteines were
changed by oligonucleotide-directed mutagenesis to threonine (Thr) and
alanine (Ala) without altering either the activity or the thermostability of the
enzyme. Consequently, the pseudo-wild-type sequence provided a standard
for comparing variants with potentially thermostabilizing disulfide bonds
and also prevented spurious disulfide bonding between the introduced
cysteine residues and the naturally occurring ones. The constructed
lysozyme derivatives had from one to three disulfide bonds.

The results of this experiment indicate that the thermal stability of the
enzyme increases with the number of disulfide bonds, the most thermostable
variant being the one with all three. However, some variants (C, E, and F in
Table 8.2), although more thermostable than the wild type or pseudo-wild
type, have lost their enzymatic activity. The loss of enzymatic activity in
these three variants probably reflects a distortion of the peptide backbone
caused by the disulfide linkage between residues 21 and 142, which all three
contain. Often, the engineering of a protein is a trial-and-error process.
Hence, the precise amino acid changes that yield the "best" variant are not
always obvious. However, from this experiment, it is clear that adding
disulfide bonds to enhance protein stability is feasible.
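
The pattern described above can be checked directly against the values in Table 8.2. The following is a minimal sketch, not part of the original study: the dictionary layout and the function names are illustrative assumptions, and the disulfide pairings (3-97, 9-164, 21-142) are taken from the text.

```python
# Data transcribed from Table 8.2:
# variant -> (disulfide pairs present, % activity, Tm in degrees C).
# The pairings 3-97, 9-164, and 21-142 are those stated in the text.
VARIANTS = {
    "pwt": (frozenset(), 100, 41.9),
    "A": (frozenset({(3, 97)}), 96, 46.7),
    "B": (frozenset({(9, 164)}), 106, 48.3),
    "C": (frozenset({(21, 142)}), 0, 52.9),
    "D": (frozenset({(3, 97), (9, 164)}), 95, 57.6),
    "E": (frozenset({(9, 164), (21, 142)}), 0, 58.9),
    "F": (frozenset({(3, 97), (9, 164), (21, 142)}), 0, 65.5),
}


def tm_gain(name, reference="pwt"):
    """Increase in Tm (degrees C) of a variant over the reference enzyme."""
    return round(VARIANTS[name][2] - VARIANTS[reference][2], 1)


def inactive_variants():
    """Names of variants with no measurable activity."""
    return sorted(n for n, (_, activity, _) in VARIANTS.items() if activity == 0)


def bonds_shared_by(names):
    """Disulfide pairs common to every variant in `names`."""
    return frozenset.intersection(*(VARIANTS[n][0] for n in names))


def active_variants_with(bond):
    """Variants that contain `bond` and still retain activity."""
    return sorted(
        n for n, (bonds, activity, _) in VARIANTS.items()
        if bond in bonds and activity > 0
    )


if __name__ == "__main__":
    lost = inactive_variants()
    print("Inactive variants:", lost)
    print("Bond shared by all of them:", set(bonds_shared_by(lost)))
    print("Tm gain of variant F over pwt:", tm_gain("F"))
```

Under these assumptions, the only bond common to all inactive variants is 21-142, and no active variant carries it, which is consistent with the interpretation given in the text.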

Xylanase. In a similar study, the development of a temperature-stable 
mutant of the enzyme xylanase from Bacillus circulans was undertaken. 
During the making of paper, wood pulp is chemically treated to remove the 
hemicellulose that would otherwise contribute to the discoloration of the 
paper product. Unfortunately, this step results in the creation of large 
amounts of potentially toxic effluent. From an environmental perspective, 
treatment of wood pulp with xylanase, which degrades hemicellulose, is 
preferred to pulping. Treatment of wood pulp with this enzyme could lower 
the amount of bleaching chemical that would otherwise be required as a 
part of this process. However, the step at which xylanase would be added 


TABLE 8.2 Properties of T4 lysozyme and six engineered variants

          Amino acid at position:
Enzyme    3     9     21    54    97    142   164   No. of -S-S-   % Activity   $T_m$ (°C)
wt        Ile   Ile   Thr   Cys   Cys   Thr   Leu   0              100          41.9
pwt       Ile   Ile   Thr   Thr   Ala   Thr   Leu   0              100          41.9
A         Cys   Ile   Thr   Thr   Cys   Thr   Leu   1              96           46.7
B         Ile   Cys   Thr   Thr   Ala   Thr   Cys   1              106          48.3
C         Ile   Ile   Cys   Thr   Ala   Cys   Leu   1              0            52.9
D         Cys   Cys   Thr   Thr   Cys   Thr   Cys   2              95           57.6
E         Ile   Cys   Cys   Thr   Ala   Cys   Cys   2              0            58.9
F         Cys   Cys   Cys   Thr   Cys   Cys   Cys   3              0            65.5

Adapted from Matsumura et al., Nature 342:291-293, 1989.

wt, wild-type T4 lysozyme; pwt, pseudo-wild-type enzyme; A through F, six engineered cysteine variants; -S-S-, disulfide bonds; $T_m$, "melting" temperature
(a measure of thermostability).
follows the hot-alkali treatment of the pulp. While it is possible to lower the
pH of this material by adding acid, current industry practice is directed
toward using less water to cool the pulp, so if xylanase were to be used in
this process, it would have to function efficiently at relatively high
temperatures.

Computer modeling of the three-dimensional structure of xylanase 
was used to predict sites along the polypeptide chain where one, two, or 
three disulfide bridges could be introduced in order to stabilize the enzyme 
without disrupting its catalytic activity. All of the eight derivatives of the B. 
circulans xylanase that were generated showed an increase in thermostability
compared with the wild type, and three of the eight mutants were as
enzymatically active as the wild type at 60°C. Moreover, one mutant, in 
which a disulfide bridge between the N- and C-terminal ends of the 
enzyme was introduced, was nearly twice as active as the wild type at 60°C 
and retained more than 85% of its activity after a 2-hour incubation at 60°C, 
whereas the wild type lost all of its activity after only 30 minutes at this 
temperature (Fig. 8.15). These results indicate that the thermostability of 
other enzymes can be enhanced, provided that sufficiently detailed X-ray 
crystallographic information is available. The success of the protein
engineering and the laboratory testing notwithstanding, it remains to be
determined whether a thermostable xylanase can be effectively incorporated
into a commercial process for the manufacture of paper. 

Human pancreatic RNase. Ribonuclease (RNase) from bull semen can act as
an antitumorigenic agent. In both in vitro and in vivo experiments, a dimeric
form of this protein is internalized into tumor cells by
non-receptor-mediated endocytosis, and when it reaches the cytosol, it selectively
degrades ribosomal RNA (rRNA), thereby blocking protein synthesis and causing cell
death. The dimeric form of this enzyme consists of two identical subunits
covalently joined by two adjacent intersubunit disulfide bridges and
stabilized by a small number of noncovalent interactions. The antitumor activity
of bull semen RNase is dependent on the dimeric structure of the protein.


FIGURE 8.15 Addition of a disulfide bond between the N and C termini of B. circulans 
xylanase stabilizes the protein. Its activity at room temperature is doubled, and at 
60°C it is largely protected against heat inactivation. 

FIGURE 8.17 Flowchart showing the solubilization and renaturation of the
dimeric human pancreatic RNase from E. coli inclusion bodies. Inclusion body
-> (1. 6 M guanidine HCl; 2. reduced glutathione) -> soluble, unfolded enzyme
-> (1. dilute 20-fold; 2. reoxidize) -> enzyme-glutathione mixed disulfide
-> (1. dithiothreitol; 2. dialysis) -> active dimeric enzyme.

FIGURE 8.16 Schematic representation of native monomeric human pancreatic
RNase (A) and engineered dimeric human pancreatic RNase (B). The native protein
was modified by changing glutamine 28 to leucine, arginine 31 to cysteine, arginine
32 to cysteine, and asparagine 34 to lysine. The monomeric enzyme is
approximately 13.6 kDa, while the dimeric form is approximately 27.2 kDa.


Bull semen RNase is the only dimeric RNase from the pancreatic-like RNase
superfamily of proteins. However, human antibodies against bull semen RNase are
often produced following repeated or prolonged use of this therapeutic protein,
thereby eventually negating the usefulness and effectiveness of bull semen
RNase in treating tumor cells. For this reason, the normally monomeric human
pancreatic RNase was engineered into a dimeric form to act as an
antitumorigenic agent.

The amino acid sequence of bull semen RNase is more than 70% identical
to the amino acid sequence of human pancreatic RNase and therefore
could be used as a model to determine which amino acids might be altered 
in the human enzyme to participate in dimer formation (Fig. 8.16). When 
dimeric human pancreatic RNase was formed in E. coli, the protein was 
localized in an insoluble inclusion body (Fig. 8.17). Solubilization of the 
inclusion body and renaturation of the unfolded protein yielded dimeric 
human pancreatic RNase that displayed a slightly lower level of
antitumorigenic activity than bull semen RNase. Depending on the tumor cell line,
up to twice as much of the engineered protein was required. Since the 
dimeric human pancreatic RNase did not impair the functioning of normal 
human diploid fibroblast cells, this engineered protein is a good candidate 
to become an important human therapeutic agent. 

Changing Asparagine to Other Amino Acids 

When proteins are exposed to high temperatures, asparagine and glutamine
residues may undergo deamidation, a reaction that releases
ammonia. With the loss of the amide moiety, these amino acids become 
aspartic acid and glutamic acid, respectively, which have different chemical 
properties at physiological pH than asparagine and glutamine, resulting in 
localized changes in the folding of the peptide chain that may lead to a loss 
of activity. 

In one study, the effect of changing some asparagine residues in the
Saccharomyces cerevisiae enzyme triosephosphate isomerase was examined.
This enzyme consists of two identical subunits; each subunit has two
asparagine residues that may contribute to the thermosensitivity of the enzyme
because they are located at the subunit interface. By
oligonucleotide-directed mutagenesis, the asparagine residues at positions 14 and 78 were
targeted for change (Table 8.3). Converting either of these asparagine
residues to threonine or isoleucine enhanced the thermostability of the enzyme,
whereas, as predicted, changing one of the asparagine residues to aspartic
acid reduced thermostability. When both asparagine residues were changed
to aspartic acid residues, the resulting enzyme was unstable even at
ambient temperature and had low enzymatic activity (data not shown).
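
The half-lives in Table 8.3 can be turned into inactivation rate constants for easier comparison. The sketch below assumes simple first-order (exponential) inactivation, which the source does not state explicitly; the function names are illustrative.

```python
import math

# Half-lives at 100 degrees C, in minutes, transcribed from Table 8.3.
HALF_LIFE_MIN = {
    "Wild type": 13,
    "Variant A": 17,
    "Variant B": 16,
    "Variant C": 25,
    "Variant D": 11,
}


def inactivation_rate(half_life_min):
    """First-order rate constant k (per minute): k = ln 2 / t_half.

    Assumes first-order inactivation, which is an assumption of this sketch.
    """
    return math.log(2) / half_life_min


def fraction_active(half_life_min, minutes):
    """Fraction of activity remaining after `minutes` of first-order decay."""
    return 0.5 ** (minutes / half_life_min)


def relative_stability(variant, reference="Wild type"):
    """Ratio of a variant's half-life to that of the reference enzyme."""
    return HALF_LIFE_MIN[variant] / HALF_LIFE_MIN[reference]


if __name__ == "__main__":
    for name, t_half in HALF_LIFE_MIN.items():
        print(
            f"{name}: k = {inactivation_rate(t_half):.4f} per min, "
            f"{relative_stability(name):.2f}x wild-type half-life, "
            f"{fraction_active(t_half, 30):.0%} active after 30 min"
        )
```

On these numbers, the double mutant (Variant C) has nearly twice the wild-type half-life, whereas the Asn-to-Asp change (Variant D) shortens it.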

The engineered proteins were also tested for their sensitivities to
proteolytic digestion. A positive correlation between thermostability and
resistance to proteolysis was observed. On the basis of these results, it should
be possible to generate thermostable forms of other enzymes by mutating
nonessential asparagine codons. In fact, a long-acting human insulin
analogue was produced by replacing an asparagine residue with glycine.
This analogue was recently approved for use as a human therapeutic
agent.

Reducing the Number of Free Sulfhydryl Residues 

Occasionally, an expressed foreign protein is less active than expected.
Protein engineering can be used to increase this activity. For example, when
the cDNA for human IFN-β was initially cloned and expressed in E. coli, the
protein product showed only a disappointing 10% of the antiviral activity
of the authentic, glycosylated form. Although reasonable amounts of
IFN-β were synthesized, most of it was found to exist as dimers
and higher oligomers that were inactive.

Analysis of the DNA sequence of the IFN-β gene revealed that the
native protein has three cysteine residues, so one or more of these residues
could be involved in the intermolecular disulfide bonding that produced
the dimers and oligomers in E. coli but not in human cells. It was reasoned
that conversion of one or more of the cysteine codons into serine codons
might result in an IFN-β derivative that would not form oligomers. Serine
was chosen to replace cysteine because the structures of the two amino
acids are identical, except that serine contains oxygen instead of sulfur and
as a result cannot form disulfide bonds.

TABLE 8.3 Stability at 100°C of the yeast enzyme triosephosphate
isomerase and its engineered derivatives

             Amino acid at position:
Enzyme       14     78     Half-life (min)
Wild type    Asn    Asn    13
Variant A    Asn    Thr    17
Variant B    Asn    Ile    16
Variant C    Thr    Ile    25
Variant D    Asp    Asn    11

Adapted from Ahern et al., Proc. Natl. Acad. Sci. USA 84:657-679, 1987.

Enzyme stability is expressed as the half-life, or rate of enzyme inactivation, at
100°C. A longer half-life indicates a more stable enzyme.

When this study was undertaken, detailed information about the structure
of IFN-β was lacking, so the researchers were forced to rely on data
derived from related proteins. In other words, they did not know which of
the three cysteine residues contributed to intermolecular disulfide bond
formation. However, the locations of the cysteine residues that form the
internal disulfide bonds in IFN-α, a structurally similar molecule, were
known, so a comparison of the primary amino acid sequence of this molecule
with that of IFN-β was made (Fig. 8.18). Alignment of the amino acid
sequences of the two protein molecules indicated that Cys-31 and Cys-141
in IFN-β are located in positions similar to those of Cys-29 and Cys-138 in
IFN-α. Because Cys-29 and Cys-138 in IFN-α are involved in the formation
of an intramolecular disulfide bond, it seemed reasonable to assume that
Cys-31 and Cys-141 of IFN-β form the corresponding bond, that the remaining
residue, Cys-17, was not involved in intramolecular disulfide bonding, and
that Cys-17 was therefore a good candidate for modification.

This deduction proved to be correct. No multimeric complexes were
formed when a Ser-17 variant of IFN-β (Ser-17 IFN-β) was expressed in E.
coli. Moreover, the Ser-17 IFN-β had a specific activity similar to that of
authentic, native IFN-β and was more stable during long-term storage than
the native form.
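
The reasoning used to pick Cys-17 can be written out as a small piece of set logic. This is only an illustrative sketch: the residue numbers come from the text, but the data structures and function names are assumptions, and the codon change shown is one possible single-base Cys-to-Ser substitution under the standard genetic code, not necessarily the one the researchers used.

```python
# Residue numbers taken from the text.
IFN_ALPHA_DISULFIDE = (29, 138)       # known intramolecular bond in IFN-alpha
IFN_BETA_CYSTEINES = [17, 31, 141]    # the three cysteines of IFN-beta
# IFN-beta cysteine -> aligned IFN-alpha cysteine (from the sequence alignment).
ALIGNED_TO_ALPHA = {31: 29, 141: 138}

CYS_CODONS = {"TGT", "TGC"}


def unpaired_cysteines(beta_cysteines, aligned_to_alpha, alpha_bond):
    """Cysteines in IFN-beta with no counterpart in the IFN-alpha disulfide."""
    paired = {b for b, a in aligned_to_alpha.items() if a in alpha_bond}
    return [c for c in beta_cysteines if c not in paired]


def cys_to_ser_codon(codon):
    """Change the middle base G -> C: TGT -> TCT, TGC -> TCC (both encode Ser)."""
    codon = codon.upper()
    if codon not in CYS_CODONS:
        raise ValueError(f"{codon!r} is not a cysteine codon")
    return codon[0] + "C" + codon[2]


if __name__ == "__main__":
    print(unpaired_cysteines(IFN_BETA_CYSTEINES, ALIGNED_TO_ALPHA, IFN_ALPHA_DISULFIDE))
    print(cys_to_ser_codon("TGT"), cys_to_ser_codon("TGC"))
```

The only cysteine left without an aligned partner is residue 17, matching the deduction described above.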

Increasing Enzymatic Activity 

In addition to stabilizing an enzyme by directed mutagenesis, it may be 
feasible to modify its catalytic function. Currently, detailed information 
about the geometry of the active site of a well-characterized enzyme is 
required in order to predictably alter enzymatic activity. With such data, 
researchers are able to deduce which specific changes might be necessary 
to modulate the substrate-binding specificity of an enzyme (Box 8.1). 

Tyrosyl-tRNA synthetase. In an early demonstration of how directed
mutagenesis could enhance enzyme activity, the enzyme tyrosyl-tRNA synthetase
from B. stearothermophilus was modified with respect to substrate binding.
Tyrosyl-tRNA synthetase catalyzes the aminoacylation of a tRNA that
specifically accepts tyrosine ($\mathrm{tRNA^{Tyr}}$) in a two-step process:

\[ \mathrm{Tyr} + \mathrm{ATP} \rightarrow \text{Tyr-A} + \mathrm{PP_i} \tag{1} \]

\[ \text{Tyr-A} + \mathrm{tRNA^{Tyr}} \rightarrow \text{Tyr-}\mathrm{tRNA^{Tyr}} + \mathrm{AMP} \tag{2} \]

In step 1, tyrosine (Tyr) is activated by ATP to yield enzyme-bound
tyrosyl adenylate (Tyr-A), with the concomitant formation of
pyrophosphate ($\mathrm{PP_i}$). In step 2, Tyr-A is attacked by the free 3' hydroxyl of the
incoming tRNA molecule, so the tyrosine moiety becomes attached to the
tRNA, and AMP is released. Both of these reactions take place while the
substrates are bound to tyrosyl-tRNA synthetase.

FIGURE 8.18 Positions of the cysteine residues and disulfide bonds of IFN-α and
IFN-β. The known intramolecular disulfide bond in IFN-α is indicated by a dashed
line, and the deduced intramolecular disulfide bond in IFN-β is indicated by a
dotted line.

The three-dimensional structure of tyrosyl-tRNA synthetase from B. 
stearothermophilus had already been determined, and the active site had 
been mapped. With the aid of computer graphics, predictions were made 
about the effects of changing one or more amino acid residues in the active 
site on the interaction of the enzyme with the reaction substrates. To test 
whether these predictions were correct, the gene for tyrosyl-tRNA synthetase
was specifically modified by oligonucleotide-directed mutagenesis.
A threonine residue at position 51 (Thr-51) was replaced with either an 
alanine or a proline residue. In the native enzyme, the hydroxyl group of 
Thr-51 forms a long hydrogen bond with the ring oxygen of the ribose 
moiety of tyrosine adenylate. It was deduced that the removal of this weak 
hydrogen bond might improve the affinity of the enzyme for ATP. 


BOX 8.1

An Overview of Enzyme Kinetics

In its simplest form, an enzyme-catalyzed reaction may be described by
the equation

\[ E + S \rightleftharpoons ES \rightarrow E + P \tag{1} \]

where the symbols represent concentrations, $E$ is the enzyme not bound to
substrate, $S$ is the unbound substrate, $ES$ is the enzyme-substrate complex,
and $P$ is the product of the enzyme-catalyzed reaction. The interaction of
$E$ with $S$ to form $ES$ is controlled by the forward rate constant $k_1$, the
dissociation of $ES$ to $E$ and $S$ is controlled by $k_{-1}$, and the formation
of $P$ from $ES$ is controlled by $k_2$. Thus, the overall rate of an
enzyme-catalyzed reaction may be described as follows:

\[ dP/dt = v = k_2\,ES \tag{2} \]

Since it is usually quite difficult to directly measure the concentration of
$ES$, it is necessary to express the rate of the enzyme-catalyzed reaction in
terms of parameters that can be readily quantified. The system may also be
described, in part, by the following equations:

\[ E_0 = E + ES \tag{3} \]

\[ S_0 = S + ES + P \tag{4} \]

\[ v = -dS/dt = k_1\,E \cdot S - k_{-1}\,ES = k_2\,ES \tag{5} \]

where $E_0$ and $S_0$ are the total amount of enzyme and substrate in the
system, respectively. From equations 3, 4, and 5, it is possible to derive an
expression for the reaction rate in terms of measurable parameters.

Thus:

\[ v = \frac{V_{max}\,S}{S + K_m} \tag{6} \]

and

\[ V_{max} = E_0\,k_{cat} \tag{7} \]

where $V_{max}$ is the maximum rate that a particular enzyme-catalyzed reaction
can attain, $K_m$ is the dissociation constant of the $ES$ complex (generally
called the Michaelis constant), and $k_{cat}$ is the catalytic rate constant.
Each of these values is dependent upon conditions, including the temperature,
pH, and salt concentration. From these equations, it becomes apparent that
the greater the value of $V_{max}$ or $k_{cat}$, the higher the rate of the
reaction. Conversely, the greater the value of $K_m$, the lower the rate of the
reaction. Despite differences in mechanisms of action, all enzyme-catalyzed
reactions can be characterized by values for $V_{max}$ (or $k_{cat}$) and $K_m$
under a particular set of conditions. This makes it easy for scientists to
rapidly compare the behaviors of different enzymes and substrates. For
example, an enzyme with a $K_m$ of 10 mM for a particular substrate has a low
affinity for that substrate, whereas an enzyme with a $K_m$ of 1 µM has a high
affinity for a particular substrate.
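
The rate law in Box 8.1 is easy to explore numerically. The following minimal sketch implements the rate expression and the relation between the maximum rate, total enzyme, and the catalytic rate constant; the function and parameter names, and the example values in the demonstration, are arbitrary assumptions chosen for illustration.

```python
def michaelis_menten_rate(s, v_max, k_m):
    """Reaction rate v = V_max * S / (S + K_m).

    `s` and `k_m` must be in the same concentration units.
    """
    if s < 0 or k_m <= 0:
        raise ValueError("s must be >= 0 and k_m must be > 0")
    return v_max * s / (s + k_m)


def v_max_from_kcat(e_total, k_cat):
    """Maximum rate V_max = E_0 * k_cat."""
    return e_total * k_cat


if __name__ == "__main__":
    # Illustrative values only: 1 uM enzyme, k_cat = 10 per second, K_m = 2 mM.
    v_max = v_max_from_kcat(e_total=1e-6, k_cat=10.0)
    k_m = 2e-3
    for s in (0.1 * k_m, k_m, 10 * k_m, 100 * k_m):
        v = michaelis_menten_rate(s, v_max, k_m)
        print(f"S = {s:.1e} M  ->  v/V_max = {v / v_max:.3f}")
```

Two properties follow directly from the rate expression: when the substrate concentration equals the Michaelis constant the rate is exactly half of the maximum, and at fixed substrate concentration a larger Michaelis constant gives a lower rate.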
The resultant enzyme variants were characterized by determining their
kinetic constants, and some of the observed changes were more dramatic
than anticipated (Table 8.4). For the Ala-51 variant, the binding affinity
of the enzyme for ATP increased approximately twofold (i.e., $K_m$ fell by
about half), without any significant change in the catalytic rate constant
($k_{cat}$). By contrast, the enzyme that contained a proline in position 51
bound ATP more than 100-fold more tightly than did the native enzyme. The
catalytic efficiency ($k_{cat}/K_m$) of the aminoacylation reaction was
increased with both of the variants. The result obtained with the Pro-51
variant was unexpected, because theoretically, the addition of a proline
residue should distort, at least locally, the α-helical polypeptide backbone
found in this region. Thus, one might have expected that this conformational
change would reduce substrate binding.
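
The comparisons in this paragraph can be recomputed from the constants in Table 8.4. This is a minimal sketch with illustrative names; the efficiencies it recomputes from the rounded table entries differ slightly from the published column (1,860, 3,200, and 95,800), presumably because the table values were rounded.

```python
# (k_cat in 1/s, K_m for ATP in mM), transcribed from Table 8.4.
TABLE_8_4 = {
    "Thr-51": (4.7, 2.5),
    "Ala-51": (4.0, 1.2),
    "Pro-51": (1.8, 0.019),
}


def catalytic_efficiency(k_cat, k_m_millimolar):
    """k_cat / K_m in 1/(s*M), converting K_m from mM to M."""
    return k_cat / (k_m_millimolar * 1e-3)


def fold_tighter_binding(variant, reference="Thr-51"):
    """How many times lower the variant's K_m is than the reference K_m."""
    return TABLE_8_4[reference][1] / TABLE_8_4[variant][1]


if __name__ == "__main__":
    for name, (k_cat, k_m) in TABLE_8_4.items():
        print(
            f"{name}: k_cat/K_m = {catalytic_efficiency(k_cat, k_m):,.0f} per s per M, "
            f"K_m lowered {fold_tighter_binding(name):.1f}-fold vs Thr-51"
        )
```

The recomputed ratios agree with the text: roughly a twofold lower Michaelis constant for Ala-51 and a more than 100-fold lower one for Pro-51, with catalytic efficiency rising in both cases even though the catalytic rate constant of Pro-51 is lower.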

This study demonstrates that, even though it may be difficult to predict 
the precise effect that a particular amino acid change will have on the
reaction kinetics, this approach can correctly identify those amino acid side
chains that might be altered to improve the kinetic behavior of an enzyme. 
Moreover, as a consequence of these kinds of experiments, it is clear that 
the affinity of an enzyme for its substrate, as well as the catalytic efficiency 
of the enzyme-catalyzed reaction, can be improved by in vitro
manipulation of a cloned gene.

Endoprotease. It is often difficult to engineer an enzyme that has a high 
level of enzymatic activity for one reaction to perform a new and different 
function that is also at a high level. The enzyme amino acid residues that 
are optimized for the original activity may interfere with the ability of the 
enzyme to function optimally for the new activity. To remedy this, one 
group of researchers argued that, following mutagenesis, it is necessary to 
simultaneously select for the new/modified activity and against the
original enzyme activity. This approach was then used to modify a target gene
encoding an endoprotease specific for cleaving between adjacent arginine 
residues. The gene was first subjected to error-prone PCR, and the
modified genes were then cloned so that the encoded modified enzyme was
fused to a peptide that displayed the construct on the surfaces of E. coli cells
(Fig. 8.19A). Two different substrates were then added. The first substrate 
was a peptide that contained three arginine residues, each with a positive 
charge, and two fluorescent dyes, one near each end of the peptide (Fig. 
8.19B). The second substrate consisted of a peptide that also contained 
three arginine residues but only one fluorescent dye near its C terminus. 
The positive charges guaranteed that the substrate molecules and their 
cleavage products would become associated with the negatively charged 
surfaces of the E. coli cells displaying the modified enzymes. The protease 


TABLE 8.4 Aminoacylation activity of native (Thr-51) and
modified (Ala-51 and Pro-51) tyrosyl-tRNA synthetases

Enzyme    $k_{cat}$ (s$^{-1}$)    $K_m$ (mM)    $k_{cat}/K_m$ (s$^{-1}$ M$^{-1}$)
Thr-51    4.7                     2.5           1,860
Ala-51    4.0                     1.2           3,200
Pro-51    1.8                     0.019         95,800

Adapted from Wilkinson et al., Nature 307:187-188, 1984.

The units for $K_m$, the binding constant of the enzyme for ATP, are
millimolar units (mM); the units for $k_{cat}$, the catalytic rate constant, are
reciprocal seconds (s$^{-1}$); and the units for $k_{cat}/K_m$, the catalytic
efficiency, are s$^{-1}$ M$^{-1}$.
FIGURE 8.19 (A) E. coli displaying a cloned endoprotease on its surface. The negative
charge on the surface of the bacterial cell is shown. (B) Positively charged peptide
substrates before and after cleavage containing an Ala-Arg bond and two different
fluorescent markers (shown in red and green) (substrate 1) and containing an
Arg-Arg bond and one fluorescent marker (substrate 2). The arrows indicate the
point of cleavage.


activity being selected for cleaves a peptide bond between arginine and 
alanine, amino acid residues that are located in the middle of substrate 1. 
The activity being selected against cleaves a peptide bond between
arginine and arginine, amino acid residues found in the middle of substrate 2.
The cleavage of substrate 1, the new activity being selected for, results in 
the production of a peptide with three positive charges and carrying a 
green fluorescent dye (Fig. 8.19B). This peptide binds to the surfaces of cells 
and is readily detectable using a fluorescent cell sorter. The cleavage of 
substrate 2, by the original enzyme activity, results in the production of a 
peptide with two positive charges carrying a red fluorescent dye (Fig. 
8.19B) that also binds to cells and is detectable using a fluorescent cell 
sorter. Following cell sorting, cells that showed both an increase in green 
fluorescence and a decrease in red fluorescence were isolated and tested 
further. Cells that displayed this activity encoded the modified enzyme 
within their plasmid DNA. In this way, one clone was isolated that
displayed a >3 million-fold selectivity for cleavage of Ala-Arg bonds
compared to Arg-Arg bonds, an enormous change in substrate specificity
without any significant decrease in the high catalytic activity used to cleave 
the original substrate. 
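
The two-color selection described above amounts to a gate that keeps cells with high green and low red fluorescence. The sketch below illustrates only that gating logic; the `Cell` record, the intensity values, and both thresholds are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One sorted event: a clone identifier and two fluorescence readings."""
    clone_id: str
    green: float  # signal from cleavage of substrate 1 (activity selected for)
    red: float    # signal from cleavage of substrate 2 (activity selected against)


def sort_gate(cells, green_min, red_max):
    """Return the clone IDs that pass both the positive and the negative gate."""
    return [
        c.clone_id
        for c in cells
        if c.green >= green_min and c.red <= red_max
    ]


if __name__ == "__main__":
    # Hypothetical readings in arbitrary units.
    population = [
        Cell("parent", green=5.0, red=900.0),      # original specificity only
        Cell("dead", green=4.0, red=6.0),          # no activity at all
        Cell("relaxed", green=700.0, red=850.0),   # cleaves both substrates
        Cell("switched", green=800.0, red=8.0),    # new specificity only
    ]
    print(sort_gate(population, green_min=500.0, red_max=50.0))
```

Gating on both channels at once is what separates a clone with a genuinely switched specificity from one that has merely become less selective.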



























Modifying Metal Cofactor Requirements 

Subtilisins are a group of serine proteases that are secreted into growth 
medium by gram-positive bacteria and are widely used as biodegradable 
cleaning agents in laundry detergents. All subtilisins bind tightly (affinity
constant [$K_a$] = ~$10^7$ M$^{-1}$) to one or more molecules of calcium per molecule
of enzyme. Calcium binding stabilizes the enzyme. Unfortunately, since 
subtilisins are used in industrial settings where there are a large number of 
metal-chelating agents that can bind to and effectively remove calcium, 
these enzymes are rapidly inactivated under these conditions. To
circumvent this problem, it is necessary first to abolish completely the ability of a
subtilisin to bind calcium and then to attempt to increase the stability of 
this modified enzyme in the absence of bound calcium. 

The starting point for the development of a modified subtilisin was the
isolated subtilisin BPN' gene from Bacillus amyloliquefaciens. Prior to this
work, the subtilisin BPN' protein had been well characterized, and its
high-resolution X-ray crystallographic structure had been determined.
Oligonucleotide-directed mutagenesis was used to construct a mutant
form of the gene for this enzyme by deleting the nucleotides encoding the
portion of the protein (amino acid residues 75 to 83) that is responsible
for binding to calcium (Fig. 8.20). The protein without this stretch of amino
acids does not bind calcium and, surprisingly, retains an overall
conformation that is similar to that of the native form.

The next steps in the development of a stable subtilisin from one that 
lacked a calcium-binding domain entailed determining which sites might 
contribute to stability and which amino acid residues should be placed at 
these sites. The researchers assumed that any of the amino acids that had 
previously interacted with the calcium-binding loop in the native form of 
the enzyme were potential candidates for change. In total, 10 amino acids 
were considered to be candidates for modification. Moreover, since it was 
not known a priori which particular amino acid residues might best
contribute to stabilizing the enzyme molecule, random mutagenesis was used
for each of these sites. 

The amino acids selected for mutagenesis came from four separate
regions of the protein: the N terminus (residues 2 to 5), the omega loop
(residues 36 to 44), an α-helical region (residues 63 to 85), and a β-pleated
region (residues 202 to 220). To identify the best amino acid at a particular
position, mutant clones were grown in the wells of microtiter plates, heated
to 65°C for 1 hour, allowed to cool, and then assayed for subtilisin activity.
It was necessary to express the active calcium-free subtilisin in Bacillus
subtilis because it was lethal when expressed in E. coli.

After the initial screening, stabilizing mutations were identified at 7 of
the 10 positions that were examined (Table 8.5). When these stabilizing
mutations were combined into a single gene, the enzyme that was
produced had kinetic properties that were very similar to those of the native
form of subtilisin. Moreover, the modified form of subtilisin was nearly 10
times more stable than the native form of the enzyme in the absence of
calcium and, surprisingly, about 50% more stable than the native enzyme
in the presence of calcium. Although this work was somewhat
labor-intensive and painstaking, it demonstrated that complex properties of enzymes
that involve a large number of different amino acid residues can be
genetically engineered.
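
The screening outcome summarized above can be tallied from Table 8.5. The sketch below simply records the table as a dictionary and counts the hits; the structure and function names are illustrative assumptions, and it makes no claim about how the individual effects combine in the final enzyme.

```python
# Position -> (stabilizing mutation, fold increase in half-life), or None when
# no stabilizing mutation was found. Values transcribed from Table 8.5.
SCREEN = {
    2: ("Gln->Lys", 2.0),
    3: ("Ser->Cys", 17.0),
    4: None,
    5: ("Pro->Ser", 1.2),
    41: ("Asp->Ala", 1.2),
    44: ("Lys->Asn", 1.2),
    73: ("Ala->Leu", 2.6),
    74: None,
    206: ("Gln->Cys", 17.0),
    214: None,
}


def stabilizing_positions(screen):
    """Positions at which a stabilizing mutation was identified."""
    return sorted(pos for pos, hit in screen.items() if hit is not None)


def ranked_hits(screen):
    """Hits sorted from largest to smallest fold increase in half-life."""
    hits = [(pos, *hit) for pos, hit in screen.items() if hit is not None]
    return sorted(hits, key=lambda h: (-h[2], h[0]))


if __name__ == "__main__":
    found = stabilizing_positions(SCREEN)
    print(f"Stabilizing mutations at {len(found)} of {len(SCREEN)} positions: {found}")
    for pos, change, fold in ranked_hits(SCREEN):
        print(f"  residue {pos}: {change} ({fold}-fold)")
```

The tally reproduces the 7-of-10 figure quoted in the text. The two 17-fold entries (positions 3 and 206) come from the same clone, as the table footnote explains, so they should be read as one disulfide-forming pair and not as independent effects.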
FIGURE 8.20 Genetic engineering of calcium-independent subtilisin. The native
calcium-containing enzyme is highly active but loses almost all of its activity when the
loop that binds the calcium is deleted. After several rounds of random mutagenesis,
mutants of the deleted enzyme, each with stabilizing mutations and a low level of
activity, are selected. Several of these mutations are combined into a single
derivative with the result that a subtilisin that does not require calcium and that has a high
level of activity is produced.


Decreasing Protease Sensitivity 

Streptokinase, a 47-kilodalton (kDa) protein produced by pathogenic
strains of Streptococcus bacteria, is a blood clot-dissolving agent. Streptokinase
forms a complex with plasminogen that results in the conversion of
plasminogen to plasmin, the active protease that degrades fibrin in the blood
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TABLE 8.5 Effects of random mutations of selected amino acid residues 
on the stability of subtilisin BPN lacking a calcium-binding domain 


Region of protein 

Amino acid 
residue 

Stabilizing 

mutation 

Fold increase 
in half-life 

N terminus 

2 

Gin—>Lys 

2.0 


3 

Ser—>Cys 

17.0 


4 

None found 

None 


5 

Pro—>Ser 

1.2 

Omega loop 

41 

Asp—> Ala 

1.2 


44 

Lys—>Asn 

1.2 

a-Helix 

73 

Ala—>Leu 

2.6 


74 

None found 

None 

(3-Pleat structure 

206 

Gin—>Cys 

17.0 


214 

None found 

None 


Adapted from Strausberg et al., Bio/Technology 13: 669-673, 1995. 

The mutations at positions 3 and 206 to Cys occur in the same clone and provide such 
a high level of stability because of the formation of the disulfide bridge between these 
residues. 


clot. Unfortunately, plasmin also rapidly degrades streptokinase, making it 
necessary for medical personnel to administer streptokinase as a 30- to 
90-minute infusion so that a sufficient level of intact and active streptoki¬ 
nase is maintained. Since it is essential that individuals suffering a heart 
attack be treated as quickly as possible, a long-lived streptokinase could be 
administered as a single injection before a person is transported to a hos¬ 
pital. This early treatment might contribute to saving the lives of heart 
attack victims by quickly restoring blood flow and limiting damage to 
heart muscles. 

Plasmin is a trypsin-like protease that specifically cleaves the peptide
bond after a lysine or arginine residue. Plasmin rapidly digests the
414-amino-acid streptokinase protein by cleaving it at lysine 59, near the N
terminus, and at lysine 386, near the C terminus. The 328-amino-acid
peptide that remains following the digestion by plasmin has approximately
16% of the activity of intact streptokinase in activating plasminogen, and it
is slowly degraded by plasmin until no activity remains. To make
streptokinase less susceptible to proteolysis by plasmin, the lysine residues at
positions 59 and 386 were changed to glutamine by site-specific
mutagenesis (Fig. 8.21). Glutamine was chosen to replace lysine because the
length of its side chain is similar to that of lysine, so that the
three-dimensional structure would not be disturbed, and because glutamine does
not have a positive charge. Both single mutants, as well as the double mutant,
had the same ability to bind to and activate plasminogen as did the native form
of streptokinase. At the same time, in the presence of plasmin, the half-lives
of all three mutants were increased compared with native streptokinase,
with the double mutant being approximately 21-fold more protease
resistant. This work is an important first step in the development of variants
of streptokinase with significantly longer half-lives.
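
The substitution strategy described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the 12-residue peptide is hypothetical (it is not the streptokinase sequence), the function names are invented for this example, and the scan lists every lysine or arginine as a candidate site, whereas plasmin cuts native streptokinase mainly at the two lysines named in the text.

```python
from typing import List


def cleavage_sites(seq: str, residues: str = "KR") -> List[int]:
    """Return 1-based positions of residues after which a trypsin-like protease could cut."""
    return [i + 1 for i, aa in enumerate(seq) if aa in residues]


def substitute(seq: str, position: int, new_residue: str) -> str:
    """Return a copy of seq with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside sequence")
    return seq[: position - 1] + new_residue + seq[position:]


# Hypothetical 12-residue peptide (one-letter code), NOT the streptokinase sequence.
toy = "MASKTLEGKVDA"
print(cleavage_sites(toy))

# Replace both lysines with glutamine (Q), as in the double mutant described in the text.
mutant = substitute(substitute(toy, 4, "Q"), 9, "Q")
print(mutant, cleavage_sites(mutant))
```

Replacing both lysines with glutamine leaves no candidate sites in the toy peptide, which mirrors the logic behind the double mutant.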

Modifying Protein Specificity 

FokI endonuclease. Although protein-engineering studies using
oligonucleotide-directed mutagenesis have focused on modifying and enhancing
existing properties of specialized enzymes, it is conceivable that an enzyme
could be redesigned to produce a unique catalytic entity. For example, new
site-specific endonucleases have been created from the relatively
nonspecific enzyme FokI endonuclease.

FIGURE 8.21 Schematic representation of protease (plasmin) sensitivity of
streptokinase and some engineered plasmin-resistant derivatives. The green
circles indicate positively charged lysine residues where plasmin cuts the
peptide chain. The red circles indicate glutamine residues where plasmin does
not cut the peptide chain. The horizontal arrows indicate plasmin digestion of
streptokinase. The protein size and activity following plasmin digestion are
indicated for each derivative. (A) Native protein; (B) the derivative in which
glutamine replaces lysine 386; (C) the derivative in which glutamine replaces
lysine 59; (D) the derivative in which glutamine residues replace lysine 59 and
lysine 386.

To date, more than 2,500 restriction-modification enzymes from a
large number of different organisms have been discovered. Since many of
these enzymes recognize the same DNA sequence, there are only about
200 different recognition sites, with the vast majority of the known
enzymes recognizing DNA sequences that are 4 to 6 base pairs (bp) in
length. Restriction enzymes that recognize 4 to 6 bp cut DNA relatively
frequently and are not as useful for producing large DNA fragments as are
restriction endonucleases that recognize DNA sequences that are 8 bp or
longer, i.e., rare cutters. Since the discovery of new restriction enzymes is
tedious, time-consuming, and unlikely to yield many new enzymes that
recognize DNA sequences that are 8 bp or longer, an alternative means of
obtaining new rare cutters is to develop them by protein engineering.
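
The relationship between recognition site length and cutting frequency can be made concrete with a rough calculation. The sketch below assumes, as a simplification that is not stated in the text, that the four bases occur with equal frequency and independently of one another; the function names and the 50,000-bp example molecule are hypothetical.

```python
def expected_spacing(site_length_bp: int) -> int:
    """Average number of base pairs between occurrences of one specific site,
    assuming equal and independent base frequencies (a simplification)."""
    return 4 ** site_length_bp


def expected_sites(genome_length_bp: int, site_length_bp: int) -> float:
    """Approximate number of times a site of the given length is expected to occur."""
    return genome_length_bp / expected_spacing(site_length_bp)


# Hypothetical 50,000-bp DNA molecule, used only for illustration.
for n in (4, 6, 8):
    print(f"{n}-bp site: about one cut per {expected_spacing(n):,} bp, "
          f"roughly {expected_sites(50_000, n):.1f} sites in 50,000 bp")
```

Under these assumptions each additional base pair in the recognition site makes cutting 4 times less frequent, which is why enzymes with sites of 8 bp or longer yield much larger fragments.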

Proteins that contain unique structural domains that each bind a
molecule of Zn2+ are called zinc finger proteins. These proteins bind
to DNA in a sequence-specific manner by inserting a protein α-helical
region into the major groove of the DNA double helix. For example, a
mouse protein called Zif268 has three separate zinc finger domains, and
each zinc finger interacts with a specific DNA triplet codon. Moreover,
since the zinc fingers bind to the DNA independently of one another, they
can be linked together in a peptide by genetic engineering so that they will
bind to a predetermined site on a DNA fragment. Thus, it is possible to
engineer nucleases that can cut DNA at specific sites by fusing several zinc
finger-encoding sequences with the portion of the gene for the nonspecific
nuclease FokI from the bacterium Flavobacterium okeanokoites. To test this
idea, a hybrid gene was constructed (Fig. 8.22) that encoded six consecutive
histidine residues at the N terminus to facilitate purification of the fusion
protein, three zinc fingers, a (Gly4Ser)3 linker peptide to confer flexibility
on the fusion protein, and the portion of FokI that has the nuclease activity.
After purification of the expressed protein, the N-terminal histidine residues
may be removed by treatment with thrombin.

Bacteria that produce restriction enzymes protect their own DNA from 
being cleaved by these restriction enzymes by synthesizing enzymes that 
bind to and methylate the restriction enzyme recognition sites on the DNA. 
However, a host genome would not be protected from digestion by the
synthetic FokI hybrid restriction endonuclease. Consequently, during cell 
growth, the expression of the hybrid enzyme was prevented by placing it 
under the control of the bacteriophage T7 expression system. 

Two FokI hybrid restriction endonucleases, each designed to cleave
bacteriophage λ DNA at a different single site, were produced. One cleaved
bacteriophage λ DNA at its target site, and the other cleaved λ DNA at the
expected site and to a lesser extent at two other sites. The latter result is not
entirely surprising, since zinc fingers recognize triplet codons by
interacting primarily with two of the three bases. Thus, although these hybrid
enzymes are not yet ready for routine laboratory use, the strategy of using
zinc finger motifs and a nuclease domain to create unique restriction
endonucleases appears promising.

FIGURE 8.22 Gene construct for a zinc finger-FokI hybrid restriction endonuclease.

Antibodies. With antibody molecules, large portions of the protein are
identical from one antibody to the next. However, a small portion of the
amino acids in the peptide chain are hypervariable, giving an antibody
molecule a high degree of specificity for the antigenic determinant to which
it binds (see chapter 10). By modifying these hypervariable regions of the
protein, it should be possible to generate antibodies in vitro that are
directed against a wide range of antigenic determinants. The portion of an
antibody molecule that contains the ability to bind to an antigenic
determinant is sometimes called a Fab fragment; it can bind in the absence of the
rest of the antibody molecule and consists of two peptides each with
different hypervariable complementarity-determining regions (CDRs)
separated by relatively invariant framework regions (Fig. 8.23). Together, the six
CDRs (three from the variable part of the light chain and three from the
variable part of the heavy chain) determine the specificity of an antibody
molecule. Consequently, altering one or more of the amino acid residues in
one of the CDRs changes the specificity of the antibody.

Using random mutagenesis with degenerate (mixed) oligonucleotide 
primers, it was possible to introduce a range of different mutations into the 
three CDRs of the variable region of an antibody heavy-chain gene (Fig.
8.24). First, one of the CDRs was modified by PCR. Then, in a second PCR,
the other two CDRs were modified. Finally, the three altered CDRs were 
combined in a single DNA fragment. With this approach, the changes could 
have just as easily been introduced into the gene for the variable portion of 
an antibody light chain. In one instance, a Fab fragment of a monoclonal 
antibody that was specific for the compound 11-deoxycortisol was altered 
as described above to produce a Fab fragment that was specific for cortisol 
and no longer bound 11-deoxycortisol. Depending on the method used to 
screen the library of mutagenized Fab genes, this approach can facilitate 
the creation of Fab fragments directed toward any antigenic determinant. 
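
To make the idea of a degenerate (mixed) oligonucleotide concrete, the sketch below expands one degenerate codon into the individual codons it represents and lists the amino acids they encode. The choice of the codon NNK (N = any base, K = G or T) is an assumption made for illustration; the text does not say which degenerate positions were used in the CDR primers. The IUPAC ambiguity codes and the standard genetic code are used as given; the function name is hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with codons ordered T, C, A, G at each position ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon):
    """List every specific codon represented by a degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in degenerate_codon))]


codons = expand("NNK")  # illustrative choice, not taken from the text
encoded = sorted({CODON_TABLE[c] for c in codons})
print(len(codons), "codons encoding:", "".join(encoded))
```

A primer synthesized with such mixed positions is therefore a pool of related sequences, and each PCR product carries one member of the pool at the targeted codon.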

FIGURE 8.23 Structure of a Fab molecule. FR, framework region. CH1 and CL are
constant domains from the heavy and light chains of the antibody molecule,
respectively. The N-terminal (NH2) and C-terminal (COOH) ends of each
polypeptide, as well as a disulfide bridge (-S-S-), are indicated.

FIGURE 8.24 Schematic representation of the method used to introduce mutations
into the three CDR genes of the variable region of a heavy antibody chain. The
framework region sequences are in green, and the CDR sequences are in blue. (A)
The first PCR introduces random mutations into the DNA encoding CDR1. (B) The
second PCR introduces random mutations into the DNA encoding CDR2 and
CDR3. (C) The third PCR combines the DNA that was amplified in panels A and B.
The circled portion of the DNA indicates the place where random mutations were
introduced.


Increasing Enzyme Stability and Specificity

tPA. The enzyme tissue plasminogen activator (tPA) is a multidomain
serine protease that is medically useful for the dissolution of blood clots.
However, like streptokinase, tPA is rapidly cleared from the circulation, so
that it must be administered by infusion. Therefore, to be effective with this
form of delivery, high initial concentrations of tPA must be used.
Unfortunately, under these conditions, tPA can cause nonspecific internal
bleeding. Thus, a long-lived tPA that has an increased specificity for fibrin
in blood clots and is not prone to induce nonspecific bleeding would be
desirable. It was found that these three properties could be separately
introduced by directed mutagenesis into the gene for the native form of
tPA. First, changing Thr-103 to Asn causes the enzyme to persist in rabbit
plasma approximately 10 times longer than the native form. Second,
changing amino acids 296 to 299 from Lys-His-Arg-Arg to Ala-Ala-Ala-Ala
produces an enzyme that is much more specific for fibrin than is the native
form. Third, changing Asn-117 to Gln causes the enzyme to retain the level
of fibrinolytic activity found in the native form. Moreover, combining these
three mutations in a single construct allows all three activities to be
expressed simultaneously (Table 8.6). Additional work is needed to
determine whether a modified form of tPA is an acceptable replacement for
native tPA.
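
One way to read the data for these variants (Table 8.6) is to compare activity against clots with activity in plasma. The sketch below computes that ratio from the table's normalized values. The ratio is an illustrative summary chosen for this example, not a measure reported by the original authors, and the dictionary layout and function name are hypothetical.

```python
# Normalized values transcribed from Table 8.6 (wild type = 1.0).
# Tuple order: (stability in plasma, fibrin binding, activity in plasma, activity vs. clots)
TPA_VARIANTS = {
    1: (10, 0.34, 0.68, 0.56),
    2: (0.85, 0.93, 0.13, 1.01),
    3: (5.3, 0.33, 0.13, 0.65),
    4: (3.4, 1.0, 1.13, 1.17),
    5: (1.2, 1.33, 0.16, 1.38),
    6: (8.3, 0.87, 0.06, 0.85),
}


def clot_to_plasma_ratio(variant):
    """Activity against clots divided by activity in plasma (higher = more fibrin specific)."""
    _, _, plasma, clots = TPA_VARIANTS[variant]
    return clots / plasma


ratios = {v: clot_to_plasma_ratio(v) for v in TPA_VARIANTS}
for v, r in ratios.items():
    print(f"variant {v}: plasma stability {TPA_VARIANTS[v][0]}, clot/plasma ratio {r:.1f}")
```

By this measure the triple mutant (variant 6) pairs the highest clot-to-plasma ratio with a plasma stability close to that of the Thr-103 single mutant.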

TABLE 8.6 Stabilities and activities of various modified versions of tPA

tPA                                                                         Stability   Fibrin    Activity    Activity
variant  Modification(s)                                                    in plasma   binding   in plasma   vs. clots
1        Thr(103)→Asn                                                       10          0.34      0.68        0.56
2        LysHisArgArg(296-299)→AlaAlaAlaAla                                 0.85        0.93      0.13        1.01
3        Thr(103)→Asn, LysHisArgArg(296-299)→AlaAlaAlaAla                   5.3         0.33      0.13        0.65
4        Thr(103)→Asn, Asn(117)→Gln                                         3.4         1.0       1.13        1.17
5        LysHisArgArg(296-299)→AlaAlaAlaAla, Asn(117)→Gln                   1.2         1.33      0.16        1.38
6        Thr(103)→Asn, LysHisArgArg(296-299)→AlaAlaAlaAla, Asn(117)→Gln     8.3         0.87      0.06        0.85

Adapted from Keyt et al., Proc. Natl. Acad. Sci. USA 91:3670-3674, 1994.

All of the values shown are normalized to the wild type. Plasma stability is the reciprocal of the time it takes for plasma
clearance; larger numbers indicate a more stable derivative. Fibrin specificity is reflected by a high activity versus clots
and a low activity in plasma.


Fructosyl-amino acid oxidase. Glycation, the nonenzymatic addition of
glucosyl residues on the surfaces of blood proteins, such as hemoglobin
and albumin, is typically increased in diabetics with high blood glucose
levels. Also, since the glycation of blood proteins is not affected by the
changes in blood glucose levels that occur following food intake, the
levels of glycated proteins serve as good indices for monitoring diabetes
patients during therapy. In particular, the hemoglobin A1c (HbA1c) value,
an index of the medical condition of diabetes patients, measures the
amount of the valine residue at the amino end of the hemoglobin β-subunit
that is glycated. One way to measure HbA1c employs the enzyme
fructosyl-amino acid oxidase isolated from a strain of Corynebacterium. This
enzyme is specific for D-fructosyl-L-valine and does not have any activity
toward Nε-fructosyl-L-lysine, the glycated amino acid associated with
serum albumin. Thus, the high level of substrate specificity of the
Corynebacterium enzyme is essential for the accurate enzymatic determination
of the amount of HbA1c. The one problem with the Corynebacterium
enzyme is that it is relatively unstable. To remedy this problem, E. coli cells
were transformed with a plasmid containing the Corynebacterium sp. gene
for fructosyl-amino acid oxidase. The transformed E. coli strain was then
subjected to repeated rounds of in vivo mutagenesis and screening for
stable enzyme activity by measuring the enzyme activity following a
10-minute incubation of the enzyme at a temperature of 47°C. In each
round of in vivo mutagenesis, cells were grown in culture and 10,000 cells
were plated on medium containing D-fructosyl-L-valine. The
fructosyl-amino acid oxidase gene from the most stable clone was isolated and
sequenced, and that clone became the starting point for the next round of
in vivo mutagenesis. As can be seen in Table 8.7, each successive round
produced a more stable enzyme with additional amino acid changes.
Importantly, in each round, the enzyme activity remained essentially
unchanged; in fact, the mutant selected following four rounds of in vivo
mutagenesis had slightly increased activity compared to the starting
enzyme. Thus, this simple directed-evolution procedure resulted in an
enzyme with significantly increased stability and the possibility of greater
practical utility.
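
The round-by-round logic of this procedure (mutagenize, screen, carry the best clone forward) can be sketched as a simple loop. Everything in the example is a toy stand-in chosen for illustration: the sequences, the scoring function that plays the role of the stability assay, the library size, and the function names are all hypothetical and do not model the actual enzyme or the in vivo mutagenesis method.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # one-letter amino acid codes


def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen residue replaced."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(ALPHABET) + seq[pos + 1:]


def evolve(start, score, rounds, library_size, rng):
    """Repeat mutagenesis and screening, carrying the best clone forward each round."""
    best = start
    history = [score(best)]
    for _ in range(rounds):
        library = [mutate(best, rng) for _ in range(library_size)]
        top = max(library, key=score)
        if score(top) > score(best):  # keep the parent if no mutant is better
            best = top
        history.append(score(best))
    return best, history


# Toy stand-in for a stability assay: count matches to a hypothetical "ideal" sequence.
IDEAL = "MKTAYIAKQR"


def toy_score(seq):
    return sum(a == b for a, b in zip(seq, IDEAL))


best, history = evolve("MATAGIAKNR", toy_score, rounds=4,
                       library_size=200, rng=random.Random(0))
print(best, history)
```

Because each round starts from the best clone of the previous round, the score recorded in `history` can only stay the same or increase, which is the pattern of stepwise improvement seen across successive rounds in Table 8.7.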

TABLE 8.7 Increasing thermostability of fructosyl-amino acid oxidase with increasing rounds of in vivo mutagenesis

             % Activity remaining
Round no.    after 10 min at 47°C    Changes to wild-type amino acid sequence
0            1                       None
1            30                      Ala-188→Gly-188, Met-244→Leu-244
2            70                      Ala-60→Thr-60, Ala-188→Gly-188, Met-244→Leu-244
3            80                      Ala-60→Thr-60, Ala-188→Gly-188, Met-244→Leu-244, Leu-261→Met-261
4            90                      Ala-60→Thr-60, Ala-188→Gly-188, Met-244→Leu-244, Asn-257→Ser-257, Leu-261→Met-261


[Diagram: His tag (His-His-His-His-His-His), EP recognition site (Asp-Asp-Asp-Asp-Lys), target protein]

FIGURE 8.25 A target protein with an N-terminal polyhistidine tail, which may be
specifically excised by digestion with an enteropeptidase (EP) that recognizes the
Asp-Asp-Asp-Asp-Lys sequence.


Enteropeptidase. The enzyme enteropeptidase (sometimes referred to as
enterokinase) is a membrane-bound serine protease from the duodenal
mucosa consisting of two polypeptide chains that converts the inactive
precursor trypsinogen to active trypsin. The bovine or porcine version of
this enzyme is often used to excise polyhistidine tags from recombinant
proteins produced in E. coli. This is achieved by including a recognition
site for enteropeptidase (Asp-Asp-Asp-Asp-Lys) between the
polyhistidine tag and the target protein (Fig. 8.25). While this strategy works
reasonably well, the enteropeptidases that are currently used to remove
polyhistidine tags digest proteins at other amino acid sequences to a
significant extent, leading to varying amounts of hydrolysis of the target
protein. While this may not be a significant problem when proteins are
purified on a laboratory scale, on an industrial scale, anything that lowers
the final yield of the target protein is a problem. Therefore, researchers
have sought to identify enzymes that recognize the same site as bovine
and porcine enteropeptidases but lack their nonspecific peptidase activity.
One group of researchers isolated and expressed cDNA encoding
enteropeptidase from the medaka (Oryzias latipes), a freshwater teleost fish. The
enzyme encoded by this organism has a level of activity comparable to
those of the bovine and porcine enzymes toward synthetic substrates
containing the Asp-Asp-Asp-Asp-Lys sequence and only about 1/10 the level
of activity of the bovine and porcine enzymes against peptide substrates
lacking this sequence. To better understand the strict specificity of the
medaka enzyme compared to the mammalian enzymes, several mutations
were created in the gene for the medaka enzyme by site-specific
mutagenesis. Amino acid residues that were conserved in four different
mammalian versions of the enzyme but not in the medaka enzyme were identified
at five different sites. These sequences were altered in the medaka enzyme
to reflect the amino acids used in the mammalian enzymes. When these
mutant enzymes were tested, one had about 90% of the activity of the
native medaka enzyme against synthetic peptides containing the
Asp-Asp-Asp-Asp-Lys sequence and as little as a fifth of the native medaka
enzyme's activity against peptides lacking this sequence. In other words,
one of the mutant enzymes was significantly better at reducing unwanted
nonspecific activities than the native enzyme. This mutant was tested with
several different fusion proteins with a histidine tag and an
Asp-Asp-Asp-Asp-Lys sequence. In all cases, the enzyme efficiently excised the
histidine tag and Asp-Asp-Asp-Asp-Lys sequence without degrading the target
protein to any detectable extent. This was in marked contrast to the
mammalian enzymes, which in all cases significantly degraded the target
proteins. Thus, by changing a single amino acid residue, it was possible to
generate an enzyme with an altered specificity that was more beneficial
for biotechnological processes.
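
The intended outcome of tag removal, a single cut immediately after the Asp-Asp-Asp-Asp-Lys sequence, can be sketched as a string operation. This is a minimal illustration of an ideal, fully specific enzyme; the target peptide and the function name are hypothetical, and the sketch does not model the nonspecific cleavage discussed above.

```python
EP_SITE = "DDDDK"  # Asp-Asp-Asp-Asp-Lys in one-letter code


def excise_tag(fusion, site=EP_SITE):
    """Return (tag_with_site, target) after one cut following the first recognition site."""
    index = fusion.find(site)
    if index == -1:
        raise ValueError("no enteropeptidase recognition site found")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


# Hypothetical target peptide, used only for illustration.
fusion = "HHHHHH" + EP_SITE + "MTEYKLVVVG"
tag, target = excise_tag(fusion)
print(tag, target)
```

Because the cut falls after the lysine of the recognition site, the released target protein begins with its own first residue and carries no residual tag sequence.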
Altering Multiple Properties

Subtilisin. Generally, directed mutagenesis has been used to alter a single 
property of a protein. However, changing one property of an enzyme often 
disrupts other important characteristics. One possible solution to this 
problem is the "molecular breeding" of new proteins starting from several 
similar genes. Portions of these genes are recombined by DNA shuffling to 
produce a large number of new proteins. This approach does not require 
any prior knowledge of the structure and function of the target protein. 

The validity of this concept was tested starting with 26 different
subtilisin genes isolated from different Bacillus strains. After the DNA was
shuffled, a library of chimeras was constructed and transformed into B.
subtilis. The library of 654 clones that was generated represented only a
small portion of the total number of chimeras that could be produced. The
enzyme secreted by each clone in the library, as well as each of the 26
parent enzymes, was assayed (in a microtiter plate) for activity at 23°C,
thermostability, solvent stability, and pH dependence (Fig. 8.26), traits that
are useful in one or more industrial applications. Of the 654 clones tested,
77 produced enzymes that performed as well as or better than the best
parent strain at 23°C. For each parameter that was tested, a number of
enzymes with improved performance were detected. When these
"superior" clones were sequenced, all of the genes were chimeric. In one
instance, one of the chimeric genes included eight crossovers that produced a
protein with 15 amino acid substitutions compared with the most similar
parent. This library, produced by molecular breeding, contains functional
genes that are more altered in sequence from the starting sequence than can
be achieved by multiple rounds of mutagenesis of a single parent.
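
A back-of-the-envelope count suggests why 654 clones can sample only a small portion of the possible chimeras. The sketch below assumes, purely for illustration, that crossovers fall at fixed positions and that any of the parents can contribute any segment; real shuffling is constrained by sequence similarity, and the count also includes the unshuffled parental arrangements. The function name is hypothetical.

```python
def segment_combinations(parents, crossovers):
    """Number of ways to assign a parent to each segment created by the crossovers."""
    return parents ** (crossovers + 1)


# 26 parents and eight crossovers, the figures mentioned in the text for one chimera.
total = segment_combinations(parents=26, crossovers=8)
print(f"{total:.2e} combinations; 654 clones sample a fraction of {654 / total:.1e}")
```

Even under these simplified assumptions the number of possible arrangements exceeds the library size by many orders of magnitude, so screening finds good variants without ever exhausting the sequence space.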

Often, the properties that are desirable for use in an industrial process 
do not exist among naturally occurring enzymes because these properties 
are not especially useful in nature. For example, an enzyme that is both 
highly active at 23°C and stable at 70°C is unlikely to have been selected for 
under natural conditions. However, using molecular breeding should 
make it easier to develop enzymes with properties that can be used as
components of industrial processes.

Peroxidase. The enzyme peroxidase from the ink cap mushroom Coprinus 
cinereus has been used as a dye transfer inhibitor in laundry detergent. This 
enzyme acts by oxidizing, and therefore decolorizing, free dyes that have 
leached out of clothing, thereby preventing their uptake by other garments. 
Unfortunately, under wash conditions using bleach-containing detergents, 
high pH (10.5), high temperature (50°C), and high peroxide concentration 
(5 to 10 mM) rapidly inactivate C. cinereus peroxidase. To use this enzyme 
as a dye transfer inhibitor in laundry detergent, it is essential that the 
enzyme withstand high pH, temperature, and peroxide levels; therefore, a 
strategy for engineering this enzyme had to be developed. 

FIGURE 8.26 Screening and testing a shuffled DNA library of the subtilisin
gene family. First, the entire library is plated in a grid formation on agar
plates containing 2% skim milk, and the clones that generate a zone of
clearance because they have digested the milk proteins are picked, numbered,
and cultured. Second, the chosen colonies are grown under different
conditions in 96-well microtiter plates before a chromogenic substrate is added.
Enzyme activity is quantified by the intensity of the color that is formed.

FIGURE 8.27 Simplified version of a scheme used to generate mutants in which
several traits are altered at once. The stars indicate point mutations. The colored
regions represent segments generated by site-directed or random mutagenesis after
DNA shuffling. Only a small number of protein variants are shown.


While it is a relatively straightforward matter to shuffle DNA and
isolate hybrid genes with unique properties when two or more similar
genes are available, this approach cannot be used when starting with a
single gene. Rather, the most effective strategy is probably either random
mutagenesis or error-prone PCR. However, one group of researchers
utilized an approach that combined either random mutagenesis or
error-prone PCR with DNA shuffling (Fig. 8.27). Based on their knowledge of
the three-dimensional structure of peroxidase from C. cinereus, scientists
used site-directed mutagenesis to replace solvent-exposed amino acids
with those with nonoxidizable side chains and to introduce stabilizing
features, such as disulfide bridges, into the protein. To identify other
areas of the protein that it might be beneficial to change, the gene was
subjected to error-prone PCR, and several beneficial mutants were
isolated. When all of the mutations that were successful in improving the
properties of the enzyme were combined into a single genetic construct,
the resultant enzyme had a 114-fold improvement in thermal stability and
a 2.8-fold improvement in oxidative stability. Unfortunately, although
these changes were in the right direction, they were insufficient to
adequately protect the enzyme against actual wash conditions. Following
another round of random mutagenesis, the genes of all of the beneficial
mutants were used as a starting point for a round of DNA shuffling.
Eventually, an enzyme with a 174-fold increase in thermal stability and a
100-fold increase in oxidative stability compared with the native enzyme
was isolated. Not only was the enzyme that was engineered suitable for
use as a dye transfer inhibitor in laundry detergent, but this approach
may also be used with a variety of enzymes to improve two or more
properties at the same time.


SUMMARY


The proper functioning of a protein is due to its conformation, which is a
consequence of its amino acid sequence and subunit structure. Certain amino
acids in a polypeptide chain play important roles in determining the
specificity, thermostability, and other properties of a protein. Changing even
a single nucleotide of the gene encoding a target protein can result in the
incorporation of an amino acid that can either disrupt the normal activity or
enhance a specific property of the protein. With the emergence of recombinant
DNA technology, it has become possible to replace nucleotides of a cloned gene
and to produce proteins with specific amino acids at defined sites. This
procedure is called directed mutagenesis, and it can be performed in various
ways. To change a particular amino acid within a protein, the target gene is
first subcloned onto bacteriophage M13 DNA. The single-stranded form of this
bacteriophage is then copied by using an oligonucleotide primer that is
designed to introduce a specified nucleotide into the target gene. E. coli
cells are transformed with the double-stranded M13, and some of the M13
bacteriophage progeny carry a variant of the cloned gene that contains the
mutation. These bacteriophages are identified, the altered gene is subcloned
into an expression vector, and the expressed protein is tested for activity.
There are also plasmid- and PCR-based strategies for introducing these kinds of
changes into cloned genes. In many instances, the amino acid change(s) that
might enhance a particular property of a target protein is not known a priori.
In these cases, random mutagenesis, error-prone PCR, or DNA shuffling, rather
than oligonucleotide-directed mutagenesis, is preferred.

The choice of which amino acid to change is often based on knowledge of the
role of a particular amino acid in the functional protein. This knowledge comes
from genetic studies or X-ray crystallographic data of the three-dimensional
organization of the protein. Specific sites or regions can be altered, or
combined, to improve the thermostability, pH tolerance, specificity, allosteric
regulation, cofactor requirements, and other properties of enzymes that are
used in industrial processes. For example, thermostability has been enhanced by
changing amino acids at two sites in the enzyme triosephosphate isomerase, and
the protease sensitivity of streptokinase has been decreased by changing two
lysine residues to glutamine. Not only are these approaches helpful in
engineering new properties for existing proteins, but they can also be used
to design unique enzymes.


REVIEW QUESTIONS


1. What physical and chemical properties of enzymes are targets for
enhancement by directed mutagenesis?

2. You have cloned a bacterial gene that is expressed in E. coli, and now you
want to alter its activity. However, because of technical problems with the
original M13 protocol, only a very small fraction of the mutagenized clones of
this gene actually carry the modified gene; the vast majority of the clones
contain the unaltered form of the gene. How would you perform site-specific
mutagenesis so that a much larger proportion of the clones have the desired
mutation?

3. You have isolated a gene for an enzyme that is expressed in E. coli.
Describe how you would alter the catalytic activity of the enzyme. Assume that
you know the DNA sequence of the gene but do not know anything about which
regions of the enzyme are important for catalytic activity.

4. Discuss the advantages and disadvantages of oligonucleotide-directed
mutagenesis using either bacteriophage M13 or PCR.

5. Describe a strategy for oligonucleotide-directed mutagenesis that uses
plasmid DNA containing the gene of interest.

6. How can degenerate oligonucleotides be used to generate random mutations
within a cloned DNA fragment?

7. Describe a strategy for increasing the stability of a protein that has
(1) no cysteine residues or (2) an odd number of cysteine residues.

8. How might replacing asparagine with another amino acid residue affect the
stability of a protein?

9. How can the cofactor requirements of an enzyme be altered?





10. Describe how you would change the catalytic activity or 
substrate specificity of an enzyme whose gene you have iso¬ 
lated. Why would you want to do this? 

11. What is error-prone PCR, and why is it useful? 

12. Outline two ways in which DNA shuffling may be used to 
generate hybrid genes. 

13. How can unusual amino acids be incorporated into pro¬ 
teins, thereby producing an altered form of the target pro¬ 
tein? 

14. How would you engineer human pancreatic RNase to be 
an antitumorigenic agent? 

15. How would you engineer streptokinase so that it was less 
sensitive to proteolytic digestion? 

16. How can the gene(s) encoding a Fab fragment of a mono¬ 
clonal antibody be modified so that the specificity of the anti¬ 
body is altered? 

17. What is molecular breeding? How can this approach be 
used to simultaneously engineer several properties of a pro¬ 
tein? 


18. Starting from a single enzyme-encoding gene, how can 
DNA shuffling be used to engineer an enzyme in which sev¬ 
eral separate and distinct properties have been modified? 

19. How would you use PCR-amplified oligonucleotide- 
directed mutagenesis to create insertion mutants? Deletion 
mutants? 

20. How would you generate random insertion/deletion 
mutations in a target gene on a plasmid? 

21. What is nonhomologous random recombination, and how 
would you use it to generate modified enzymes? 

22. How would you engineer an enzyme with a high level of 
activity toward one substrate to have a high level of activity 
for a different substrate and a low level of activity toward the 
original substrate? 

23. How would you ensure that an enteropeptidase cleaved 
only at the target site within the protein of interest and 
nowhere else? 

With the advent of recombinant DNA technology, many of the 
properties of microorganisms that might be useful in a variety of 
applications can be more readily exploited. In part II, we examine 
some of the uses for genetically engineered microbial systems. 

Currently, bacteria are being genetically manipulated to act as biolog¬ 
ical factories for the production of pharmaceutical proteins, nucleic acid 
therapeutic agents, restriction endonucleases, chemical compounds, amino 
acids, antibiotics, and biopolymers. In some applications, cloned genes 
have been introduced into bacterial host cells to create novel biosynthetic 
pathways that produce novel metabolites. Genes and DNA fragments from 
pathogenic organisms have been isolated and used as probes for the diag¬ 
nosis of disease in both animals and humans. In other instances, isolated 
genes and DNA fragments have been used to produce safer and more effi¬ 
cacious vaccines. 

Genetic manipulation of microbial systems also entails enhancing the 
natural ability of certain bacterial strains to carry out specific biological 
processes. For example, researchers have developed bacterial strains that 
can degrade environmental pollutants, improve the growth of plant crops, 
degrade cellulosic biomass into utilizable low-molecular-weight com¬ 
pounds, and prevent the proliferation of specific insect pests. 

It is often assumed that the growth of large quantities of microbes is a 
routine procedure. Successful large-scale production of proteins synthesized 
by recombinant microorganisms, however, requires that many different factors 
be controlled during both the growth phase of the microorganism and 
the purification process to ensure that high yields of a pure product are 
obtained. 

The success of modern medicine and agriculture often depends on the 
ability of workers in these fields to detect the presence of specific 
viruses, bacteria, fungi, parasites, proteins, and small molecules in 
humans, animals, plants, water, and soil. For example, the prevention, con¬ 
trol, or treatment of infectious disease is generally facilitated by the early 
and accurate identification of the causative pathogenic organism. Many of 
these detection procedures require the growth in culture of the potential 
pathogen and then the analysis of a spectrum of physiological properties 
that facilitate its identification. Although tests of this type are effective and 
reasonably specific, they are often slow and costly. These constraints apply 
to the identification of both bacterial and parasitic (Table 9.1) organisms. In 
addition, if the pathogenic organism does not grow well or cannot be cul¬ 
tivated, the opportunity to detect the disease-causing organism is severely 
limited. For example, Chlamydia trachomatis, an obligately intracellular 
bacterium, causes a sexually transmitted disease prevalent in North America 
and Europe. Clinical diagnosis of chlamydial infection is difficult, because 
long-term cell culture is required. Frequently, false-negative results (i.e., the 
diagnosis of the absence of the organism is erroneous) are obtained, and 
consequently, adequate treatment procedures are not implemented. 
Certainly, if growth were required for detection, then at best only a few of 
all known pathogenic organisms could ever be routinely identified. To 
overcome this major constraint, molecular diagnostic procedures using 
either immunological or DNA detection methodologies have been 
devised. 

In general, any useful detection strategy must be specific, sensitive, 
and simple. Specificity means that the assay must yield a positive response 
for only the target organism or molecule. Sensitivity means that the diag¬ 
nostic test must identify very small amounts of the target organism or 
molecule, even in the presence of other potentially interfering organisms or 
substances. Simplicity is required for the test to be run efficiently, effec¬ 
tively, and inexpensively on a routine basis. 

TABLE 9.1 A comparison of some of the methods used to diagnose parasite infection 

Method: Microscopic examination 
  Advantages: 
    Simple 
    Direct detection of parasite 
    Differentiates morphologically distinct organisms 
  Disadvantages: 
    Slow, laborious, and tedious 
    Low sensitivity 
    Cannot discriminate between similar organisms 
    Requires a high skill level 

Method: In vitro culture and mouse inoculation 
  Advantages: 
    Detects only viable parasites 
    Measures virulence and infectivity 
  Disadvantages: 
    Slow and expensive 
    Different strains show a range of responses 
    Parasite may lose its viability in the specimen 
    Uses animals 

Method: Detection of antibodies in serum 
  Advantages: 
    Simple and fast 
    Automatable 
    Can be used to screen a large number of samples 
  Disadvantages: 
    Not always specific 
    Does not distinguish between active and latent infections 

Method: DNA hybridization and PCR 
  Advantages: 
    Fast, sensitive, and specific 
    Detects parasite directly 
    Can distinguish different species 
    Independent of previous infections 
    Parasites need not be viable 
    Automatable 
  Disadvantages: 
    Expensive and multistep 
    Does not distinguish between live and dead organisms 
    Possible false positives and false negatives 

Adapted from Weiss, Clin. Microbiol. Rev. 8:113-130, 1995. 


It is estimated that worldwide sales of immunodiagnostics accounted 
for approximately $7.7 billion in 1999, and this figure continues to increase 
by 5 to 10% per year. The market for DNA-based diagnostic procedures 
was around $500 million in 1999 and is increasing at around 20 to 30% per 
year, so that in 2004 it was worth approximately $2 billion. In this chapter, 
the principles behind some of these molecular diagnostic procedures and 
the use of these procedures for a variety of applications are discussed. 
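
As a rough check on the growth figures quoted above, the following sketch compounds the 1999 value of the DNA-based diagnostics market over five years. The function name and the assumption of a constant annual growth rate are illustrative only and are not taken from the source.

```python
def project_market(start_value, annual_growth, years):
    """Compound start_value at a fixed annual growth rate for a number of years."""
    return start_value * (1.0 + annual_growth) ** years


# Figures quoted in the text: about $500 million in 1999, growing 20 to 30% per year.
start_1999 = 500.0  # millions of dollars
low_2004 = project_market(start_1999, 0.20, 5)
high_2004 = project_market(start_1999, 0.30, 5)

# Constant annual rate implied by growing from $500 million to $2 billion in 5 years.
implied_rate = (2000.0 / start_1999) ** (1.0 / 5) - 1.0

print(f"2004 projection at 20%/year: ${low_2004:,.0f} million")
print(f"2004 projection at 30%/year: ${high_2004:,.0f} million")
print(f"Rate implied by $2 billion in 2004: {implied_rate:.1%}")
```

Under these assumptions the projection falls between roughly $1.2 billion and $1.9 billion, so the quoted figure of approximately $2 billion corresponds to growth at or slightly above the upper end of the stated range.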


Immunological Diagnostic Procedures 

Many immunological detection methods are sensitive, specific, and simple. 
They can be used for a wide range of applications, including drug testing, 
assessment and monitoring of various cancers, detection of specific metab¬ 
olites, pathogen identification, and monitoring infectious agents. However, 
there are limitations. For example, if the target is a protein, then the use of 
antibodies requires that the genes contributing to the presence of the target 
site be expressed and that the target site not be masked or blocked in any 
way that would prevent the binding of the antibody. 

In principle, traditional diagnostic procedures for infectious agents rely 
on either a discrete set of traits characteristic of the pathogenic agent or, 
preferably, one unique, readily distinguishable feature. The clinical micro¬ 
biologist searches for the smallest number of biological characteristics that 
can, with complete certainty, reveal the presence and precise identity of a 
pathogenic agent. For example, some infectious agents produce distinctive 
biochemical molecules. The problem is how to determine when the identi¬ 
fying component is present in a biological sample. Often, such a marker 
molecule can be identified directly in a specialized biochemical assay that 
is very specific for the marker molecule. The problem with this approach is 
that it can potentially lead to a proliferation of highly individualized detec¬ 
tion systems for different pathogenic organisms. A standardized method of 
identifying any key marker molecule, regardless of its chemical nature, is 
preferred. Because antibodies bind with high specificity to discrete target 
sites (antigens), assays based solely on identifying specific antibody- 
antigen complexes have abolished the need to devise a unique identifica¬ 
tion procedure for each particular marker molecule. 

ELISA 

There are a number of different ways to determine whether an antibody has 
bound to its target antigen. The enzyme-linked immunosorbent assay 
(ELISA) is one method, and it is frequently used for diagnostic detection. 
The ELISA procedure may be either indirect (Fig. 9.1A) or direct (Fig. 9.1B). 
A generalized indirect ELISA protocol (Fig. 9.1A) has the following steps. 

1. Bind the sample being tested for the presence of a specific molecule 
or organism to a solid support, such as a plastic microtiter plate, 
which usually contains 96 sample wells. Wash the support to remove 
unbound molecules. 

2. Add a marker-specific antibody (primary antibody directed against 
the target antigen) to the bound material, and then wash the support 
to remove unbound primary antibody. 

3. Add a second antibody (secondary antibody) that binds specifi¬ 
cally to the primary antibody and not to the target molecule. Bound 
(conjugated) to the secondary antibody is an enzyme, such as alka¬ 
line phosphatase, peroxidase, or urease, that can catalyze a reaction 
that converts a colorless substrate into a colored product. Wash 
the mixture to remove any unbound secondary antibody-enzyme 
conjugate. 

4. Add the colorless substrate. 

5. Observe or measure the amount of colored product. 

If the primary antibody does not bind to a target site in the sample, the 
second washing step removes it. Consequently, the secondary antibody- 
enzyme conjugate has nothing to bind to and is removed during the third 
washing step, and the final mixture remains colorless. Conversely, if the 
target site is present in the sample, then the primary antibody binds to it, 
the secondary antibody binds to the primary antibody, and the attached 
enzyme catalyzes the reaction to form an easily detected colored product. 
Since secondary antibodies that are complexed with an enzyme are avail¬ 
able commercially, each new diagnostic test requires only a unique primary 
antibody. In addition, several secondary antibody molecules, each with 
several enzyme molecules attached, bind to one primary antibody mole¬ 
cule, thereby amplifying the intensity of the signal. 
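
The logic of the indirect ELISA steps described above can be illustrated with a small simulation of a single well. This is a minimal sketch, not a laboratory model: the function name, the molecule labels, and the numbers of secondary antibodies per primary antibody and of enzymes per secondary antibody are all hypothetical values chosen for illustration.

```python
def indirect_elisa(sample_molecules, target_antigen,
                   secondaries_per_primary=2, enzymes_per_secondary=3):
    """Return a relative color signal for one well of an indirect ELISA."""
    # Step 1: molecules in the sample bind to the solid support.
    bound = list(sample_molecules)

    # Step 2: primary antibody binds only the target antigen; the wash
    # removes any primary antibody that found nothing to bind.
    primary_bound = sum(1 for molecule in bound if molecule == target_antigen)

    # Step 3: enzyme-conjugated secondary antibody binds only to primary
    # antibody; the wash removes unbound conjugate.
    secondary_bound = primary_bound * secondaries_per_primary

    # Steps 4 and 5: each bound enzyme converts colorless substrate into
    # colored product, so the signal scales with the number of enzymes.
    return secondary_bound * enzymes_per_secondary


positive = indirect_elisa(["antigen_X", "albumin", "antigen_X"], "antigen_X")
negative = indirect_elisa(["albumin", "casein"], "antigen_X")
print(positive, negative)
```

A well without the target antigen gives a signal of zero because nothing survives the washes, while each bound target molecule contributes several enzyme molecules, which is the amplification effect noted in the text.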

With a direct ELISA protocol (Fig. 9.1B), a monoclonal antibody specific 
for the target antigen is first bound to the surface of the microtiter plate. To 
assess the amount of a particular antigen in a sample, the sample is added 
to the well of the microtiter plate and allowed to interact with the bound 
antibody. This is followed by a wash to remove any unbound molecules. 
Then, the primary antibody and the secondary antibody conjugated to an 
enzyme are added, as described above, before the presence of bound 
antigen is visualized. 

FIGURE 9.1 Generalized ELISA protocol for detecting a target antigen. The primary 
antibody is often obtained from rabbits that have been immunized with the target 
antigen, while the secondary antibody is from goats immunized with rabbit 
antibodies. The enzyme (E) is conjugated to the secondary antibody. (A) Indirect ELISA; 
(B) direct ELISA. 

The principal feature of an ELISA system is the specific binding of the 
primary antibody to the target site. If the target molecule is, for example, a 
protein, then a purified preparation of this protein is generally used to gen¬ 
erate the antibodies that will be used to detect the target. The resulting 
antibody mixture, which is found in the serum (antiserum) of an inoculated 
animal, usually a rabbit, contains a number of different antibodies that 
would each bind to a different antigenic determinant (epitope) on the target 
molecule. Such a mixture of antibodies is called a polyclonal preparation. 
For some diagnostic assays, the use of polyclonal antibodies has two draw¬ 
backs: (1) the amounts of the different antibodies within a polyclonal prepa¬ 
ration may vary from one batch to the next, and (2) polyclonal antibodies 
cannot be used to distinguish between two similar targets, e.g., when the 
difference between the pathogenic form (target) and the nonpathogenic one 
(nontarget) is a single determinant. However, these problems can be over¬ 
come, because it is now possible to generate an antibody preparation that is 
directed against a single antigenic determinant, namely, a monoclonal anti¬ 
body. Also, despite these drawbacks, diagnostic assays employing poly¬ 
clonal antibodies are widely used for a variety of purposes. 

Monoclonal Antibodies 

In mammals, a complex set of cellular systems has evolved to protect the 
body from toxic substances and invasion by infectious agents. As part of 
the defensive response, cells of the lymphatic system can be induced to 
produce specific proteins (antibodies) that bind to foreign substances (anti¬ 
gens) and—with the help of other immune system proteins, including the 
complement system—neutralize their biological impact. In response to an 
immunological challenge, each antibody-producing cell synthesizes and 
secretes a single antibody that recognizes with high affinity a discrete 
region (epitope, or antigenic determinant) of the immunizing antigen. 
Because an antigen generally has several different epitopes, normally sev¬ 
eral cells of the immune system each produce a different antibody against 
one of the many epitopes of the antigen. Such a set of antibodies, all of 
which react with the same antigen, is designated a polyclonal antibody 
(Fig. 9.2). 

Early in the 20th century, although the polyclonal nature of antibodies 
was not appreciated, it was realized that antibody specificity could be used 
to prevent infections. Later, antibodies were used as diagnostic agents to 
determine the presence of toxic substances in clinical samples. Unfortunately, 
the effectiveness of a polyclonal antibody preparation varies from batch to 
batch because, in some immunizations, certain antigenic determinants of a 
particular antigen are strong stimulators of antibody-producing cells, 
whereas at other times, the immune system responds more actively to dif¬ 
ferent epitopes of the same antigen. Also, individual animals often respond 
differently to a particular antigen. This variation can affect the abilities of 
different preparations to neutralize antigens because different epitopes 
have different potencies (stimulating abilities). Hence, one batch of poly¬ 
clonal antibody may have a low level of antibody molecules directed 
against a major epitope and not be as effective as a previous antibody 
preparation. 

Consequently, a fundamental objective for the applied use of anti¬ 
bodies, as diagnostic agents or as components of therapeutic agents, was 
to discover how to create a cell line that could be grown in culture and 
that would produce a single type of antibody molecule (monoclonal anti¬ 
body) with a high affinity for a specific target antigen. Such a cell line 
would provide a consistent and continuous source of identical antibody 
molecules. Unfortunately, the B lymphocytes (B cells) that synthesize anti¬ 
bodies do not reproduce in culture. However, it was envisioned that a 
hybrid cell type could be created to solve this problem. This hybrid would 
have the B-cell genetic components for producing antibodies and the cell 
division functions of a compatible cell type to enable the cells to grow in 
culture. It was known that normal B lymphocytes sometimes become 
cancer cells (myelomas) that acquire the ability to grow in culture while 
retaining many of the attributes of B cells. Thus, myeloma cells, especially 
those that did not produce antibody molecules, became candidates for 
fusion with antibody-producing B cells. In the mid-1970s, these ideas 
became reality. 

Formation and Selection of Hybrid Cells 

The initial step leading to the production of a hybrid cell line that produces 
a single antibody entails the inoculation of mice with an antigen. After 
several inoculations over a period of a few weeks, the animals are tested, 
generally using an ELISA or similar system, to determine whether they 
have developed an immune response. If they have, they are killed and their 
spleens are removed, washed, minced, and gently agitated to release 
individual cells, some of which are antibody-producing B cells. The splenic cell 
suspension is mixed with a suspension of myeloma cells that are genetically 
defective for the enzyme hypoxanthine-guanine phosphoribosyltransferase 
(HGPRT⁻). The combined cell suspensions are mixed with 35% 
polyethylene glycol for a few minutes and then transferred to a growth 
medium containing hypoxanthine, aminopterin, and thymidine (HAT 
medium). 

FIGURE 9.2 Schematic representation of a target antigen. The surface of the 
antigen depicted has seven (numbered 1 to 7) different antigenic determinants 
(epitopes). When this antigen is used to immunize an animal, each antigenic 
determinant elicits the synthesis of a different antibody. Together, the 
different antibodies that interact with an antigen constitute a polyclonal 
antibody directed against that antigen. 

FIGURE 9.3 The HAT procedure for selecting hybrid spleen-myeloma (hybridoma) 
cells. 

The polyethylene glycol treatment facilitates fusion between cells. 
Nevertheless, the fusion events are rare and random. There will be 
myeloma cells, spleen cells, myeloma-spleen fusion cells, myeloma- 
myeloma fusion cells, and spleen-spleen fusion cells in the mixture. The 
HAT medium, however, allows only the myeloma-spleen fusion cells to 
grow, because none of the other cell types can proliferate in this medium. 
Unfused spleen cells and spleen-spleen fusion cells cannot grow in any 
culture medium. The HGPRT⁻ myeloma and the myeloma-myeloma fusion 
cells cannot use hypoxanthine as a precursor for the biosynthesis of the 
purines guanine and adenine, which are, of course, essential for nucleic 
acid synthesis. However, they have a second, naturally occurring pathway 
for purine biosynthesis that utilizes the enzyme dihydrofolate reductase. 
Therefore, aminopterin is included in the medium because it inhibits 
dihydrofolate reductase activity. Hence, HGPRT⁻ myeloma and 
myeloma-myeloma fusion cells are unable to synthesize purines in HAT medium, so 
they die (Fig. 9.3). 

The spleen-myeloma fusion cells survive in HAT medium because the 
spleen cell contributes a functional HGPRT, which can utilize the exoge¬ 
nous hypoxanthine in the medium even though purine production by 
means of dihydrofolate reductase is blocked by aminopterin, and because 
the cell division functions of the myeloma cell are active. Thymidine is 
provided to overcome the block in pyrimidine production that is caused by 
the inhibition of dihydrofolate reductase by aminopterin. About 10 to 14 
days after the fusion treatment, only spleen-myeloma fusion cells have 
survived and grown in the HAT medium. These cells are then distributed 
into the wells of plastic microtiter plates and grown on complete culture 
medium without HAT. 
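
The selection logic of the HAT procedure can be summarized as a truth table over the five cell types present after fusion. The sketch below encodes only the two properties discussed in the text (a functional HGPRT and the ability to divide in culture); the class and function names are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    has_hgprt: bool           # functional HGPRT, contributed by spleen cells
    divides_in_culture: bool  # cell division functions, contributed by myeloma cells


def grows_in_hat(cell):
    """A cell proliferates in HAT medium only if it can divide in culture and
    can make purines from hypoxanthine via HGPRT, because aminopterin blocks
    the dihydrofolate reductase-dependent pathway."""
    return cell.divides_in_culture and cell.has_hgprt


cells = [
    Cell("unfused spleen cell", True, False),
    Cell("spleen-spleen fusion", True, False),
    Cell("unfused HGPRT- myeloma", False, True),
    Cell("myeloma-myeloma fusion", False, True),
    Cell("spleen-myeloma fusion (hybridoma)", True, True),
]

survivors = [cell.name for cell in cells if grows_in_hat(cell)]
print(survivors)
```

Only the spleen-myeloma fusion satisfies both conditions, which is why HAT medium selects for hybridomas.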

Identification of Specific Antibody-Producing Hybrid Cell Lines 

The next task is to identify those hybrid cells that produce antibody 
against the immunizing antigen. One common screening procedure uses 
the culture medium, which contains secreted antibodies. The medium is 
collected from the wells that have growing cells and is added to a well of 
another microtiter plate that has been precoated with the target antigen. If 
the culture medium contains an antibody (primary antibody) that recog¬ 
nizes an epitope of the antigen, it will bind to the antigen and not be 
washed away during a subsequent washing step. A second antibody (sec¬ 
ondary antibody) that is specific for mouse antibodies is added to the 
wells of the test plate. It will bind to any primary antibody that is bound 
to the antigen. 

Before its use in the immunoassay, the secondary antibody is conju¬ 
gated to an enzyme that can convert a colorless substrate to a colored com¬ 
pound. The presence of color in one of the test wells indicates that the 
original culture medium contained an antibody that was specific for the 
antigen (Fig. 9.4). If the culture medium does not contain an antibody that 
binds to the antigen, then the first wash will remove the primary antibody. 
Therefore, when the secondary antibody is added, it has nothing to bind to 
and is removed by the second washing step. In a well where such a 
sequence of events occurs, the substrate solution remains colorless. 

Those wells of the original microtiter plate whose media give a posi¬ 
tive (color) response in the immunoassay may contain a mixture of cell 
fusions. These cells are therefore diluted with culture medium and seeded 
into fresh wells to establish cell lines from single cells (clones). After the 
clones have been cultured, their media are tested again to determine 
which cell lines (hybrid spleen-myeloma cells, or hybridomas) produce 
monoclonal antibody molecules that recognize the target antigen. If more 
than one specific hybridoma is isolated, further tests are conducted to 
determine whether the different clones produce antibody against the 
same antigenic determinant. Each monoclonal antibody-producing clone 
can be maintained, more or less indefinitely, in culture. In addition, 
samples can be frozen in liquid nitrogen to provide a source of cells for 
future use. 

Because a monoclonal antibody binds to a single discrete site, the 
specificity of an ELISA protocol can be considerably enhanced by using a 
monoclonal rather than a polyclonal antibody. Many monoclonal antibodies 
have been developed for use as immunodiagnostic agents for a variety of 
compounds and pathogenic organisms (Table 9.2). As an alternative to the 
isolation and synthesis of monoclonal antibodies in hybridoma cells in 
culture, monoclonal antibodies and parts of antibodies (Fab or Fv fragments) 
directed against a target antigen may be selected and produced in Escherichia 
coli (see chapter 10). 

FIGURE 9.4 Screening for the production of a monoclonal antibody. Spleen cells from 
a mouse that was immunized with a specific antigen are isolated and fused in 
culture with myeloma cells that do not produce antibodies. Fused cells are selected for 
the ability to grow on HAT medium, which contains hypoxanthine, aminopterin, 
and thymidine. Cells that produce a specific antibody to the immunizing antigen 
(hybridomas) are identified by an immunoassay and individually subcultured. A 
hybridoma, which grows in culture and secretes a single type of antibody 
molecule, is the source of a monoclonal antibody. 


Biofluorescent and Bioluminescent Systems 

Proteins that naturally fluoresce or luminesce, or that can be easily induced 
to do so, may be used as biological reporters in a variety of ways. For 
example, genes encoding these bioreporter proteins may be used to engi¬ 
neer cells to produce a measurable signal in response to a particular 
chemical or physical agent in their environment. In one version of this 
system, a gene for a fluorescent or luminescent protein is placed under the 
control of a promoter that responds to certain environmental signals so 
that when the promoter is activated, a fluorescent or luminescent signal is 
produced (Fig. 9.5). 

Colored Fluorescent Proteins 

Green fluorescent protein. The 238-amino-acid-long photoprotein green 
fluorescent protein, isolated from the jellyfish Aequorea victoria, fluoresces 
green when it is exposed to ultraviolet light. While many fluorescent dyes 
are phototoxic, the incorporation of green fluorescent protein into cells 
allows intact living cells to be monitored in real time. The use of this 
reporter molecule has revolutionized fluorescence microscopy. Among its 
many uses, researchers have used green fluorescent protein to monitor 
tumor cells in gene therapy protocols, to assess the responses of specific cell 
types to various therapeutic drugs and treatments, to monitor protein- 
protein interactions, and to monitor the fates of individual proteins in dif¬ 
ferent therapies. 

Red fluorescent protein. Following the discovery and subsequent suc¬ 
cessful employment of green fluorescent protein, scientists began to search, 
both in nature and by directed mutagenesis, for other colored fluorescent 
proteins. Having multiple colored fluorescent proteins would enable sev¬ 
eral biological processes to be monitored at the same time. For the practical 
use of these proteins, it is essential that they be both as stable and as bright 
as possible. One problem with many of the naturally occurring colored 
fluorescent proteins is that they often have a tendency to form homodimers 
or homotetramers. Such multimeric structures can adversely influence the 
subcellular localization of the proteins, potentially leading to intracellular 
aggregation and other artifacts. One group of researchers isolated a gene 
for a red fluorescent protein from Discosoma coral and, by means of mul¬ 
tiple random mutations, generated a mutant that existed exclusively as a 
monomer instead of as a tetramer. With each iterative cycle of random 
mutagenesis, proteins that yielded red fluorescence that was both bright 
and stable were selected. The production of monomeric red fluorescent 
protein required 33 mutations. This success notwithstanding, monomeric 
red fluorescent protein had several drawbacks compared to the native 
tetrameric form of the protein, including decreased brightness and reduced 
photostability. 


TABLE 9.2 Targets for diagnostic monoclonal antibodies 

Polypeptide hormones 
  Chorionic gonadotropin 
  Growth hormone 
  Luteinizing hormone 
  Follicle-stimulating hormone 
  Thyroid-stimulating hormone 
  Prolactin 

Tumor markers 
  Carcinoembryonic antigen 
  Prostate-specific antigen 
  Interleukin-2 receptor 
  Epidermal growth factor receptor 

Cytokines 
  Interleukins 1-8 
  Colony-stimulating factor 

Drug monitoring 
  Theophylline 
  Gentamicin 
  Cyclosporin 

Miscellaneous targets 
  Thyroxine 
  Vitamin B12 
  Ferritin 
  Fibrin degradation products 
  Tau protein 

Infectious disease 
  Chlamydia 
  Herpes 
  Rubella 
  Hepatitis B 
  Legionella 
  Human immunodeficiency virus 

FIGURE 9.5 Schematic representation of a bacterial cell, engineered to respond to a 
particular environmental compound, producing a detectable fluorescent or 
bioluminescent signal. 


To address some of the above-mentioned problems with monomeric red 
fluorescent protein, as well as to expand the repertoire of available fluores¬ 
cent proteins, additional rounds of mutagenesis were performed. First, 
DNA encoding 7 amino acids from the N terminus of green fluorescent pro¬ 
tein was added to the gene for red fluorescent protein (Fig. 9.6). Then, DNA 
for the 6 amino acids from the green fluorescent protein C terminus was 
added to the gene for the red fluorescent protein. This construct then 
became the starting point for several additional rounds of random mutagen¬ 
esis and directed evolution. Eventually, seven different monomeric colored 
fluorescent proteins were produced (Fig. 9.7). It is argued that there is no 
one best colored fluorescent protein. Some are brighter than others, some 
are more photostable, and some are more sensitive to changes in pH. Thus, 
various fluorescent proteins may be used for different applications. Moreover, 
using two or three of these proteins, it is possible to label several different 
cellular components or cell types at once, thereby increasing the utility of 
this approach. 

Luciferase 

The luciferase enzyme, which catalyzes a light-emitting reaction, may be 
produced by a variety of different organisms, including bacteria, algae, 
fungi, jellyfish, insects, shrimp, and squid. Luciferase genes from bacteria 
are termed lux genes, while those from other organisms—the most widely 
studied and utilized being the firefly—are termed luc genes. The lux system 
includes five genes, luxCDABE, and produces a peak of light at 490 nm. In 
some applications, all five lux genes are utilized as a means of monitoring 
the presence and concentrations of various compounds in the environment, 
such as organic compounds, including phenol, salicylate, benzene, 
trichloroethylene, ammonia, xylene, toluene, and ethylbenzene, or metals, 
including cobalt, copper, iron, lead, mercury, nickel, and zinc. When all five 
genes are used, the light-generating system does not require the addition of 
any other compounds. Therefore, following the addition of a contaminant- 
containing sample to bacterial cells carrying luxCDABE, a quantifiable 
amount of bioluminescence is produced within a period of a few minutes 
to no more than a few hours. In some cases, the reporter system includes 
only the luxAB genes. While luxA and luxB are together responsible for 
generating the light signal, this system requires that the substrate decanal 
be added during the assay procedure. 

The firefly luciferase-catalyzed reaction results in the production of 
light at 550 to 575 nm. Moreover, the system requires the addition of the 
low-molecular-weight organic compound luciferin as a substrate for the 
light reaction. Typically, luc genes are used in conjunction with eukaryotic 
cells. 

Microbial Biosensors 

There is a need for methods that can easily and rapidly detect the large 
numbers of potentially toxic compounds that contaminate the environ¬ 
ment. Once the contaminated sites have been identified and their range has 
been delineated, there are a number of highly sophisticated analytical tech¬ 
niques available to identify and quantify specific pollutants. Bacteria that 
are constitutively bioluminescent (i.e., unlike the situation mentioned 
above, the bioluminescence does not need to be induced) are good candi¬ 
dates for pollutant detectors. In the presence of pollutants, the biolumines¬ 
cence decreases, providing a clear indication of the presence of the 
pollutants. Naturally bioluminescent bacteria, such as the marine bacte¬ 
rium Vibrio fischeri, require saline conditions and a particular pH range and 
are therefore not useful for testing terrestrial groundwater. However, struc¬ 
tural genes encoding enzymes that lead to bioluminescence (luxCDABE) 
may be inserted into random sites in the chromosomal DNA of a soil bac¬ 
terium, such as Pseudomonas fluorescens. These genes do not contain a tran¬ 
scriptional promoter, so after insertion into the chromosomal DNA of P. 
fluorescens, the only luminescent colonies (visualized in a darkroom) are 
those in which the lux genes are inserted downstream from a constitutive 
P. fluorescens promoter (without disrupting any important bacterial genes). 
The cells that luminesce to the greatest extent and have a growth rate sim¬ 
ilar to that of the wild-type strain are selected for testing with various 
environmental pollutants. To screen water samples for the presence of 
various pollutants (both metals and organic compounds), a suspension of 
bioluminescent P. fluorescens is mixed with the solution being tested, and 
after a 15-minute incubation together, the luminescence of the suspension 

FIGURE 9.6 Construction of a modified monomeric red fluorescent protein (modi¬ 
fied mRFP). The regions of the gene encoding the N and C termini of the green 
fluorescent protein (GFP) were spliced onto the gene for mRFP following the 
removal of the portion of the mRFP gene encoding the N terminus and the addition 
of an oligonucleotide encoding a short linker peptide (L). 
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FIGURE 9.7 Various colored monomeric fluorescent proteins derived from mono¬ 
meric red fluorescent protein showing their emission wavelengths and color 
maxima following excitation. The colors have been called, from derivative 1 
through 7, honeydew, banana, orange, tomato, tangerine, strawberry, and cherry, 
respectively. Adapted from Shaner et al., Nat. Biotechnol. 22:1567-1572, 2004. 


is measured in a luminometer (Fig. 9.8). When a test sample contains a low 
to moderate level of certain pollutants, the cell luminescence is inhibited, 
presumably because the pollutant directly interferes with bacterial metabo¬ 
lism. Since this procedure is rapid, simple, and inexpensive, it is a good 
first screen for assessing the presence of pollution at a particular site. After 
a positive response with a bacterial biosensor, the actual pollutants can be 
determined by other methods. 
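
The screening logic described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of relative light units, and the 20% cutoff are assumptions made for the example and are not part of the published assay.

```python
def percent_inhibition(control_rlu: float, sample_rlu: float) -> float:
    """Percent decrease in luminescence relative to a pollutant-free control.

    Readings are in relative light units (an assumed unit for this sketch).
    """
    if control_rlu <= 0:
        raise ValueError("control luminescence must be positive")
    return 100.0 * (control_rlu - sample_rlu) / control_rlu


def flag_sample(control_rlu: float, sample_rlu: float, cutoff: float = 20.0) -> bool:
    """Return True when inhibition meets the cutoff (a hypothetical value)."""
    return percent_inhibition(control_rlu, sample_rlu) >= cutoff
```

For example, a control reading of 1,000 units and a sample reading of 750 units correspond to 25% inhibition, which this hypothetical cutoff would flag for follow-up analysis by other methods.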

In the United States, it has been estimated that there are approximately 
87,000 different chemical compounds that need to be tested for estrogenic 
activity, i.e., steroid-like activity that can disrupt the endocrine system in 
vertebrates. While a number of different methods exist that could be used 
to test these compounds, they are too slow for this sort of large-scale 
screening. Therefore, scientists have developed a simple and sensitive 
system for the rapid initial screening of these 87,000 compounds. With this 
system, yeast (Saccharomyces cerevisiae) cells have been genetically engi¬ 
neered to produce measurable quantities of light in the presence of 
extremely low levels of estrogenic compounds (Fig. 9.9). Using this method, 
in which luminescence is induced, to test for estrogenic compounds, light 
production could be detected in as little as 1 hour. Moreover, following 6 
hours of incubation, the assay attained maximum bioluminescence when 
the engineered yeast cells were exposed to as little as 5 x 10^-11 M
17β-estradiol, a common estrogen. Of course, many estrogenic compounds
required higher concentrations in order to be detected. The main drawback 
of this approach is that the yeast cell wall and transport system facilitate 
the entry of some compounds into the cell and inhibit the uptake of other 
compounds. This can skew the results and in some instances may suggest 
that a compound is not estrogenic when it is unable to efficiently enter 
yeast cells. Nevertheless, this technique is likely to identify a large number 
of estrogenic compounds that were deemed nonestrogenic until they were 
tested with this protocol. 

At the same time that some groups of scientists are working to develop 
and perfect cells as biosensors, others have focused their efforts on auto¬ 
mating these systems. Such an automated system might include genetically 
engineered cells that emit blue-green light (~490 nm) in response to specific

FIGURE 9.8 Assaying for the presence of pollutants with genetically engineered
bioluminescent P. fluorescens.


compounds, an environment that sustains the cells, a light-tight enclosure, 
and an integrated circuit for light detection and signal processing. While 
simple prototypes have been constructed, the difficulties associated with 
utilizing living cells remain major impediments. 


Nucleic Acid Diagnostic Systems 

The genetic material of an organism contains the essential information that 
contributes to its various features and characteristics. For example, bacte¬ 
rial pathogenicity may be due to the presence of a specific gene or set of 
genes. Similarly, alteration of a gene may cause an inherited genetic disease 
in humans. In theory, the sequence of nucleotides that contributes to a par¬ 
ticular biological characteristic is a distinctive signature that, if detectable, 
can be used as a definitive diagnostic determinant. 

Nucleic acid hybridization is the basis for rapid and reliable assays. 
The physical basis of these systems is precise nucleotide base pairing and 
hydrogen bonding between one string of nucleotides and a complementary 
nucleotide sequence. A general laboratory nucleic acid hybridization 
scheme is as follows. 

1. Bind single-stranded DNA (the target) to a membrane support. 

2. Add single-stranded labeled DNA (the probe) under appropriate 
conditions of temperature and ionic strength to promote base 
pairing between the probe and the target DNAs. 

3. Wash the support to remove excess unbound labeled probe DNA. 

4. Detect the hybrid sequences that form between the probe and target 
DNA. 























CHAPTER 9 



FIGURE 9.9 Schematic representation of a strain of the yeast Saccharomyces cerevisiae 
that has been engineered to detect low levels of estrogenic substances in the envi¬ 
ronment. Each yeast cell contains the human estrogen receptor (hER) gene inte¬ 
grated into its chromosomal DNA, one plasmid (pUTK404) that contains 
constitutively expressed bacterial luxCDE genes and the flavin oxidoreductase (frp) 
gene from the bacterium Vibrio harveyi, and one plasmid (pUTK407) that contains 
luxA and luxB genes under the regulatory control of estrogen response elements 
(ERE) and constitutive bacterial promoters (not shown). Following the binding of 
the hER protein to an estrogenic compound, the complex binds to an ERE and acti¬ 
vates luxAB transcription. Expression of the luxAB, luxCDE, and frp genes leads to 
the production of a measurable bioluminescent signal. Adapted from Sanseverino 
et al., Appl. Environ. Microbiol. 71:4455-4460, 2005.


A nucleic acid hybridization diagnostic test has three critical elements: 
probe DNA, target DNA, and signal detection. This type of detection 
system can be both extremely specific and highly sensitive. 
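
The base-pairing rule that underlies probe-target hybridization can be illustrated computationally. The following Python sketch assumes perfect complementarity and ignores the effects of temperature and ionic strength; the function names are hypothetical and chosen only for this example.

```python
# Watson-Crick pairing for DNA: A with T, G with C.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_hybridizes(target: str, probe: str) -> bool:
    """True if the probe is perfectly complementary to some stretch of the
    single-stranded target (both sequences written 5' to 3')."""
    return reverse_complement(probe) in target.upper()
```

In this simplified model, a probe gives a signal only if its reverse complement occurs in the membrane-bound target strand, which mirrors steps 2 through 4 of the scheme above.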

Hybridization Probes 

To be effective, a nucleic acid hybridization probe must have a high degree 
of specificity. In other words, the probe must hybridize exclusively to the 
selected target nucleic acid sequence. False positives (i.e., responses in the 
absence of the target sequence) and false negatives (i.e., no response when
the target is present) severely undermine the utility of a diagnostic proce¬ 
dure. Probes can be specific at different organismic levels. They can distin¬ 
guish between two or more species, determine particular strains within a 
given species, or identify differences between genes. Depending on the 
requirements of the test protocol, probes can be DNA or RNA, long (>100 
nucleotides) or short (<50 nucleotides), and chemically synthesized, cloned 
intact genes, or isolated regions of a gene. 

Sequences that might make effective probes can be isolated in a number 
of ways. For example, the DNA from a pathogenic organism can be cut 
with a restriction endonuclease and cloned into a plasmid vector. 
Recombinant plasmids are screened with the genomic DNA from both 
pathogenic and nonpathogenic strains. Those plasmids that contain 
sequences that hybridize only to the pathogenic strain form the basis for 
species-specific, and even strain-specific, probes (Fig. 9.10). Additional 
hybridization tests with DNA from a wide range of organisms are then 
conducted to ensure that the candidate probe sequences do not cross- 
hybridize. Each potential probe is also tested under simulated sample 
conditions, including the presence of mixed cultures, to determine its level 
of sensitivity. It is important to note that knowledge of the genomic 
sequence of a large number of bacterial pathogens (currently several hun¬ 
dred) has facilitated the identification of unique stretches of DNA that 
could be used as probes. 
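
When genome sequences are available, the search for pathogen-specific stretches of DNA can be approximated in silico. The Python sketch below lists short sequences (k-mers) that occur in a pathogen sequence but in neither strand of the comparison sequences. The function names and the default length of 20 nucleotides are assumptions for illustration, and any candidate found this way would still need the hybridization tests described above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """All overlapping subsequences of length k."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_probes(pathogen: str, nonpathogens: list, k: int = 20) -> list:
    """k-mers present in the pathogen but absent from both strands of
    every nonpathogenic comparison sequence."""
    unique = kmers(pathogen, k)
    for other in nonpathogens:
        unique -= kmers(other, k)
        unique -= kmers(reverse_complement(other), k)
    return sorted(unique)
```

Both strands of each comparison sequence are checked because genomic DNA is double stranded, so a probe could cross-hybridize to either one.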

The ability to perform nucleic acid probe diagnostic assays directly on 
available samples without either additional culturing or time-consuming 
extraction procedures is extremely desirable, especially with clinical speci¬ 
mens. Researchers have successfully used probes that hybridize to target 
DNA from fecal samples, urine, blood, throat washes, and tissue samples 
without extensive DNA purification. If a target sequence is rare in the 
working sample, the polymerase chain reaction (PCR) can be used to 
amplify it. 

Diagnosis of Malaria 

An example of a diagnostic protocol that utilizes a DNA probe as a means 
of detection is the procedure developed for the detection of Plasmodium 
falciparum. This parasite causes malaria, a disease that affects about one- 
third of the world's population. The parasite infects and destroys red blood 
cells, leading to fever and, in severe cases, damage to the brain, kidneys, 
and other organs. Sensitive, simple, and inexpensive methods are required 
to identify the source(s) of the parasite in various localities, to assess the 
progress of eradication programs, and to facilitate early treatment. Currently, 
malarial infections are diagnosed by either microscopic examination of 
blood smears or immunological detection of parasite antigens, effective but 
labor-intensive and time-consuming processes, especially given the large 
numbers of samples that need to be examined. Although immunological 
procedures for Plasmodium detection, such as ELISAs, are rapid and ame¬ 
nable to automation, they do not always discriminate between current and 
past infections, because they are designed simply to detect anti-Plasmodium 
antibodies in the blood of affected individuals. 

A DNA diagnostic procedure that selectively measures only current 
infections, i.e., the presence of DNA-containing organisms, was developed 

FIGURE 9.10 Overview of the development and use of a DNA hybridization probe.


by using highly repeated DNA (DNA that is present in many copies) from 
P. falciparum. First, a genomic library of the parasite DNA was screened 
with labeled whole-genome parasite DNA. The most intensely labeled 
hybridizing colonies were selected because they were expected to contain 
repetitive DNA. The DNA from each of the selected colonies was then 
tested for its ability to hybridize with DNA from several other Plasmodium 
species that do not cause malaria. The DNA sequence that was chosen as a 
specific probe hybridized with P. falciparum but not with Plasmodium vivax, 
Plasmodium cynomolgi, or human DNA, despite the fact that P. vivax causes 
a less severe form of malaria. This probe can detect as little as 10 picograms 
of purified P. falciparum DNA or 1 nanogram (ng) of P. falciparum DNA in 
blood. 

More than 100 different DNA diagnostic probes have been isolated and 
characterized for the detection of various pathogenic strains of bacteria, 
viruses, and parasites. For example, probes have been developed for the 
diagnosis of human bacterial infections caused by Legionella pneumophila 
(respiratory failure), Salmonella enterica serovar Typhi (food poisoning),
Campylobacter hyointestinalis (gastritis), and enterotoxigenic E. coli (gastro¬ 
enteritis). Clearly, this is just the "tip of the iceberg," because in principle, 
nearly all pathogenic organisms can be detected by this procedure. 

Detection of T. cruzi

The protozoan parasite Trypanosoma cruzi is the causative agent of Chagas 
disease. In this disease, the parasites invade the liver, spleen, lymph 
nodes, and central nervous system, where they multiply and destroy the 
infected cells. T. cruzi is quite common in Latin America. It is spread by 
insects and is responsible for approximately 50,000 deaths per year. 
Diagnosis of acute Chagas disease is usually made by microscopic exami¬ 
nation of a fresh blood sample. Alternatively, a test that takes a longer time 
but ensures that the parasite has not been overlooked entails feeding a 
patient's blood to uninfected insects and then examining with a micro¬ 
scope the contents of the insects' intestines for parasites about 30 to 40 
days later. Both of these tests are laborious, time-consuming, and costly. 
Chagas disease can also be diagnosed by immunological tests; however, 
these tests are notorious for producing false-positive responses. As a pos¬ 
sible alternative to these less than satisfactory procedures, several PCR- 
based assays have been developed. At present, PCR assays for Chagas 
disease are used as adjuncts to the traditional diagnostic procedures that 
are currently in widespread use. 

In one of the PCR-based assay procedures, a 188-base-pair (bp) DNA 
fragment that is present in multiple copies in the T. cruzi genome but is 
absent from the genomic DNA of several related parasites is the target 
sequence. The presence of the amplified 188-bp DNA fragment is readily 
detected by polyacrylamide gel electrophoresis. In general, with minor 
variations in the methodology, such as the primer sequences, PCR can 
facilitate the detection of a wide range of bacteria, viruses, and parasites. 
Currently, there are several PCR-based diagnostic kits that have been 

The development of techniques in the early 1980s for rapid, efficient, and
inexpensive chemical synthesis of DNA oligonucleotides opened up the
possibility of using radiolabeled oligonucleotides as hybridization probes
for the detection of a variety of DNAs, including mutations in human genes.
At the time, relatively few human genes or even cDNAs had been isolated.
Thus, it was not always easy to find a homologous hybridization probe for a
particular human gene. Moreover, even if the relevant cDNAs had been
isolated, it was not possible to discriminate by DNA hybridization between
wild-type and mutant human genes that had only a single-base-pair difference
when the hybridization probe was large (>100 bp). However, Conner et al.
synthesized specific oligonucleotides that could recognize either the
wild-type or the mutant DNA. Moreover, oligonucleotides complementary to the
sequences of both the wild type and the mutant were synthesized so that it
could be determined whether a person was heterozygous, as well as whether a
person was homozygous. For sickle-cell anemia, two 19-base-long
oligonucleotides were used, one complementary to the normal β-globin gene
(β^A) and the other complementary to the sickle-cell gene (β^S). DNA from
normal individuals (β^A β^A) hybridized only with the β^A probe, DNA from
individuals with sickle-cell anemia (β^S β^S) hybridized only with the β^S
probe, and DNA from heterozygous individuals (β^A β^S) hybridized with both
probes. This model system was the first demonstration of the feasibility of
determining genotypes by DNA hybridization and opened up the possibility of
detecting a range of human genetic disorders by hybridization with
oligonucleotide probes. This possibility has been realized with the
development of a large number of DNA-based gene mutation tests. Although the
original strategy has been largely supplanted by newer techniques, such as
PCR and OLA, this work was important in establishing that single DNA base
pair mutations could be accurately and easily detected.










approved for use by the U.S. Food and Drug Administration for the detec¬ 
tion and quantitation of human immunodeficiency virus, Mycobacterium 
tuberculosis (the causative agent of tuberculosis), and C. trachomatis. 

Nonradioactive Hybridization Procedures 

In many research laboratories, nucleic acid hybridization is routinely 
detected by labeling the probe with a radioactive isotope, commonly
phosphorus-32. High specific activity ensures an excellent signal-to-noise ratio.
In a standard detection system, a radiolabeled probe is mixed with target 
DNA that is bound to a membrane support. After the support is washed 
free of nonhybridized probe DNA, the presence of radioactivity is deter¬ 
mined by laying the membrane on X-ray film (autoradiography). 

However, phosphorus-32 is short-lived, is potentially dangerous, and 
requires special laboratory equipment for handling and safe disposal, so 
nonradioactive systems for indicating hybrid DNA formation have also 
been developed. The nonradioactive detection systems achieve signal 
amplification by enzymatic conversion of either chromogenic or chemilu¬ 
minescent substrates. Chromogenic substrates change color and chemilu¬ 
minescent substrates give off light when they are converted into a specific 
product by an appropriate enzyme. The signal is detected, in most of these 
systems, by incorporating biotin-labeled nucleotides into the DNA probe 
and following a more or less standard procedure: 

1. The biotin-labeled probe is hybridized to the target DNA (Fig. 
9.11A). 

2. Either avidin, a chicken egg white protein, or streptavidin, a bacte¬ 
rial analogue of avidin, is added (Fig. 9.11B). 

3. A biotin-labeled enzyme, such as alkaline phosphatase or peroxi¬ 
dase, is added (Fig. 9.11C). 

4. Depending on which biotin-labeled enzyme was used in the pre¬ 
vious step, either a chromogenic or a chemiluminescent substrate 
is added, and either the color change or the light produced as a 
consequence of the conversion of substrate into product is mea¬ 
sured (Fig. 9.11D). 

Alternatively, following hybridization with a biotin-labeled probe (step 1
above), a streptavidin-enzyme complex with an available biotin-binding site
can be added in place of the separate additions in steps 2 and 3.

Both avidin and streptavidin bind very tightly (K_d [dissociation
constant] = ~10^-15 M) to biotin; in addition, each of these proteins has four
separate biotin-binding sites, so a single molecule of avidin or streptavidin can
bind both a biotin-labeled enzyme and a biotin-labeled probe. Enzymatic 
activity is not impaired by biotin labeling or binding to streptavidin. In 
chromogenic detection systems, the action of the enzyme on the substrate 
creates a colored insoluble dye that remains at the site of the hybrid DNA. 
In chemiluminescent systems, enzymatic alteration of the substrate gener¬ 
ates a product that emits light at the site of the hybrid DNA. 

Nonradioactive systems have other advantages: biotin-labeled DNA is 
stable for at least 1 year at room temperature, devices that detect chemilu¬ 
minescence are as sensitive as those that detect radioactive signals, and 
detection of the emitted light with either X-ray film or a luminometer, or 
scoring of a color change, can be completed within a few hours. The use of 


chemiluminescence, which is more sensitive than chromogenic dyes, is 
becoming the detection signal system of choice for many nucleic acid probe- 
based diagnostic assays. For PCR-based assays, the amplification product 
can be labeled by a fluorescent dye that is bound to the 5' end of each 
primer. A fluorescent compound emits light of a longer wavelength after it 
absorbs light of a shorter wavelength. Fluorescein, which appears green 
under certain wavelengths of light, and rhodamine, which appears red, are 
often used for this purpose. After PCR amplification of a target DNA with 
fluorescence-labeled primers, the primers are separated from the amplifica¬ 
tion product and the presence of the label is detected (Fig. 9.12). If the target 
DNA is not present in the sample, then no fluorescent product will be 


FIGURE 9.11 Chemiluminescent detection of target DNA. (A) A biotin-labeled probe 
is bound to the target DNA. (B) Streptavidin is bound to the biotin molecules. (C) 
Biotin-labeled alkaline phosphatase binds to the streptavidin. (D) Alkaline phos¬ 
phatase converts the substrate into a light-emitting product. B, biotin; AP, alkaline 
phosphatase. 

observed. This system is not only sensitive, it is also quite rapid, since it is
not necessary to run a gel to separate the amplified target DNA. 

Molecular Beacons 

A novel nonradioactive method for detecting specific sequences of nucleic 
acids involves using "molecular beacon" probes (Fig. 9.13). A typical molec¬ 
ular beacon probe is 25 nucleotides long. The 15 nucleotides in the middle 
are complementary to the target DNA and are designed so that this
single-stranded molecule does not form a structure in which these nucleotides
base pair with one another. However, the 5 nucleotides at each end
are complementary to each other and not to the target DNA. A fluorescent 
molecule (fluorophore) is attached to the 5' end, and a nonfluorescent mol¬ 
ecule (quencher) that can absorb the energy emitted by the fluorophore 
before it fluoresces is attached to the 3' end. In solution at room temperature, 
the conformation of the molecular beacon ensures that the fluorophore and 
quencher are close to one another, and the fluorophore is quenched (does 
not fluoresce). On the other hand, when the 15 middle nucleotides of the 
molecular beacon probe hybridize to a target DNA or RNA sequence, the 
fluorophore and quencher are separated from each other and the fluoro¬ 
phore is not quenched, i.e., it fluoresces. With this procedure, care must be 
taken to maintain the reaction mixture at near-ambient temperatures, since 
high temperatures can also cause the nucleotides in the intrastrand (hairpin) 
stem of the molecular beacon to become unpaired, with the result that the 
molecule fluoresces. For this procedure to be effective, all 15 nucleotides in 
the molecular beacon probe must be perfectly complementary to the target 


FIGURE 9.12 Use of fluorescent dyes that are attached to primers for detecting
amplified PCR products. The primers are marked P1 and P2.

DNA or RNA. The sensitivity of this procedure can be improved dramatically
if the target DNA is first amplified by PCR.
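
The structural rules for a molecular beacon probe given above can be expressed as a simple check. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, only perfect matches are considered, and the additional requirements that the arms not pair with the target and that the middle nucleotides not pair with one another are not tested.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_valid_beacon(beacon: str, target: str,
                    stem_len: int = 5, loop_len: int = 15) -> bool:
    """Check the layout described in the text: two mutually complementary
    arms flanking a middle region that is perfectly complementary to the
    target. All sequences are written 5' to 3'."""
    beacon = beacon.upper()
    if len(beacon) != 2 * stem_len + loop_len:
        return False
    five_arm = beacon[:stem_len]
    loop = beacon[stem_len:stem_len + loop_len]
    three_arm = beacon[-stem_len:]
    # The arms must pair with each other to form the hairpin stem.
    stems_pair = three_arm == reverse_complement(five_arm)
    # The middle region must pair with the target without any mismatch.
    loop_binds = reverse_complement(loop) in target.upper()
    return stems_pair and loop_binds
```

With the default values, the check accepts only 25-nucleotide probes whose 15 middle nucleotides match the target exactly, so a single nucleotide change in the target causes it to fail.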

A number of variations of the basic molecular beacon protocol have 
been developed. For example, combinations of molecular beacon probes 
may be used simultaneously provided that each one is complementary to 
a different target DNA and carries a fluorophore that emits light of a
different color (Fig. 9.14). To determine the genotype
of an individual, two different molecular beacon probes are added to a 
biological sample, such as blood, that contains DNA from that individual. 
The first molecular beacon probe is labeled with one of the fluorophores 
(e.g., fluorescein), and all 15 probe nucleotides are exactly complementary 
to the wild-type sequence (Fig. 9.15). One nucleotide difference is sufficient 
to prevent hybridization. The second molecular beacon probe is labeled 
with a different fluorophore (e.g., Texas red), and the 15 probe nucleotides 
are complementary to the sequence from the mutant form (Fig. 9.15). 
Following hybridization, the appearance of green fluorescence indicates a 
homozygous normal genotype, red fluorescence indicates a homozygous 
mutant genotype, and green and red fluorescence indicates a heterozygous 
genotype. 
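
The interpretation of the two fluorescent signals amounts to a small decision table, sketched here in Python. The function name, the arbitrary intensity units, and the threshold value are assumptions for illustration only.

```python
def call_genotype(green: float, red: float, threshold: float = 1.0) -> str:
    """Interpret two-color molecular beacon signals.

    green: signal from the wild-type probe (e.g., fluorescein)
    red:   signal from the mutant probe (e.g., Texas red)
    threshold: hypothetical cutoff separating signal from background
    """
    wild_type_present = green >= threshold
    mutant_present = red >= threshold
    if wild_type_present and mutant_present:
        return "heterozygous"
    if wild_type_present:
        return "homozygous normal"
    if mutant_present:
        return "homozygous mutant"
    return "no call"
```

A "no call" result is included for the case in which neither probe fluoresces, for example when the sample contains too little target DNA.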

DNA Fingerprinting 

The DNA from a biological sample left at the scene of a crime can be ana¬ 
lyzed and compared with the DNAs of likely suspects. A match between 
evidence and a particular individual is helpful to the prosecution. In addi¬ 
tion, DNA comparisons are used to determine whether individuals have 


FIGURE 9.13 Hybridization of a molecular beacon probe to target DNA. If the target 
sequence is present, the unpaired single-stranded portion of the molecular beacon 
base pairs with its complementary sequence, thereby separating the fluorophore 
and quencher moieties. As a result, the fluorescence of the fluorophore is not 
quenched. Adapted from Tyagi and Kramer, Nat. Biotechnol. 14:303-308, 1996.

FIGURE 9.15 Use of two different molecular
beacon probes to distinguish
between different genotypes. Once the 
probe is bound to target DNA, the fluo¬ 
rescence of the fluorophore is no longer 
quenched. 

FIGURE 9.14 Molecular-beacon probes with different fluorophores. Beneath each
molecular beacon probe is the name of the fluorophore and the wavelength of its 
maximum emission. Each fluorophore is represented (on the left) by the color of its 
fluorescence. 


been wrongly convicted of a crime. In other instances, DNA analyses help 
determine paternity and identify victims of disasters. Distinguishing indi¬ 
viduals with DNA analysis is called DNA fingerprinting (DNA typing). 
One approach for determining DNA relationships among humans relies on 
DNA hybridization to undegraded minisatellite DNA. The probes for this 
type of analysis consist of human minisatellite DNAs, sequences that
occur throughout the human genome and consist of tandemly repeated 
sequences (Fig. 9.16). The lengths of the repeats range from 9 to 40 bp, and 
the numbers of repeats in the minisatellites range from about 10 to 30. A 
minisatellite DNA sequence at a specific chromosome location can have 
different lengths in different individuals. This variability is due to either a 
gain or a loss of tandem repeats, probably during DNA replication. These 
changes do not have any biological effect because minisatellite DNA does 
not encode any proteins. Unrelated individuals generally have minisatel¬ 
lites that differ in length, but children inherit one set of minisatellite DNA 
sequences from each parent. For minisatellite DNA typing, the sample 
DNA is digested with a restriction enzyme, and the fragments are sepa¬ 
rated on an agarose gel and transferred by blotting them onto a nylon 
membrane. The membrane is hybridized sequentially with four or five 
separate labeled minisatellite DNA probes, each of which recognizes a dis¬ 
tinct DNA sequence. After each hybridization reaction, the bands in which 
the probe has bound to the digested DNA sample are visualized by autora¬ 
diography, and the banding pattern for each sample is noted (Fig. 9.17). 
Before the next probe is used, the first probe is completely removed 
(stripped) from the membrane. Since each hybridization and autoradiography
step can take 10 to 14 days, the entire process may take many
weeks, and even several months. 

A minisatellite DNA pattern (fingerprint) represents the repertoire of 
the lengths of some of these sequences in an individual. Because of the 
extensive variability in human minisatellite DNA sequences, the chance of 
finding two individuals in the population with the same DNA fingerprint 
is about 1 in 10^5 to 1 in 10^8. Therefore, individuals' DNA banding patterns
based on minisatellite DNA sequences are almost as unique as their finger¬ 
prints. 

RAPD 

Not only are DNA banding patterns important for forensic analyses, they 
are also useful in distinguishing among different plant cultivars. Random 
amplified polymorphic DNA (RAPD) markers may be used for this pur¬ 
pose. With this procedure, an arbitrary oligonucleotide primer, usually 9 
to 10 bp long, that does not contain any palindromic sequences and has a 
G+C content of 50 to 80% is added to a sample of plant chromosomal 
DNA; virtually any oligonucleotide sequence will suffice. Because of its 
short sequence, the added oligonucleotide will pair with the chromosomal 
DNA at many sites, sometimes including opposite strands on the target 
DNA. When the 3' ends of the oligonucleotides on opposite strands of the 
DNA face each other, the DNA in between can be amplified (Fig. 9.18). 
Although the sequence of each primer is known, it is not known which 
oligonucleotide, if any, will be effective in priming the PCR. Whenever a 
primer can hybridize to both strands of the target DNA in the proper ori¬ 
entation and the two sites are about 100 to 3,000 bp from each other, the 
intervening DNA region will be amplified via PCR. The DNA fragments 
of characteristic size that are produced can be visualized following poly¬ 
acrylamide gel electrophoresis. The number of amplified DNA fragments 
in a sample is dependent on the primer and the genomic DNA used. Each 
time that the same primer is used with the same target DNA, the amplified 
products will be the same. A single nucleotide substitution in a primer will 
result in a complete change in the RAPD pattern. Thus, the RAPD finger¬ 
prints of different plant cultivars can be compared when the same set of 
oligonucleotide primers is used. To fingerprint the DNAs of two very 
similar plant strains or cultivars, it is often necessary to perform the RAPD 
procedure with a number of different arbitrary primers with known 



FIGURE 9.16 Schematic representation of human minisatellite DNA. Only one DNA 
strand is shown. In this example, the repeating unit is 9 bp, and there are 5, 6, and 
7 repeating units per cluster (although 10 to 30 repeating units are more common). 























sequences until differences are revealed (Fig. 9.19). Like other molecular 
markers, RAPDs can be used to characterize whole genomes, individual 
chromosomes, or, less commonly, specific genes. Although the procedure 
was originally developed for plants, it is also useful in the characterization 
of microorganisms. 

In comparison with other procedures for characterizing complex DNA, 
the RAPD procedure has a number of advantages. (1) The same (universal) 
set of oligonucleotide primers can be used for all plant species. (2) No 
genomic libraries, radioactivity, Southern transfers, or DNA hybridization
reactions are required, so a large number of samples may be easily and 
rapidly characterized. (3) The process can be automated. Moreover, with 
conventional PCR analysis it is necessary to know the sequence of a specific 
gene or gene segment that is the target for amplification. On the other 
hand, amplification in RAPD analysis occurs anywhere in a genome where 


FIGURE 9.17 Southern blot of a forensic DNA sample. The DNA samples from the 
victim, the defendant's shirt, and the defendant were treated with the same restric¬ 
tion enzyme. Here, the banding pattern of the DNA extracted from the blood on the 
defendant's shirt is identical to the victim's DNA banding pattern and different 
from the defendant's pattern. The sizes of the DNA molecules in these bands are 
estimated by comparison with the positions of the sizing standards. 

FIGURE 9.18 Binding of a single random oligonucleotide (arrow) to the chromosomal
DNA of an animal, plant, or microbe. When two of the oligonucleotides on opposite 
strands are oriented facing one another and are 100 to 3,000 bp apart, the inter¬ 
vening DNA is amplified by PCR. 


there are two sequences complementary to the primer that are within the 
length limits of the PCR. 
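
The condition for RAPD amplification can be illustrated with a small in silico sketch in Python. It assumes perfect primer matches (actual RAPD reactions tolerate some mismatches), measures the 100- to 3,000-bp limit as the full product length including both primer sites, and uses hypothetical function names.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_all(seq: str, pattern: str) -> list:
    """Start positions of every (possibly overlapping) match of pattern."""
    hits = []
    i = seq.find(pattern)
    while i != -1:
        hits.append(i)
        i = seq.find(pattern, i + 1)
    return hits


def rapd_products(genome: str, primer: str,
                  min_len: int = 100, max_len: int = 3000) -> list:
    """Sizes of fragments that a single arbitrary primer could amplify.

    A product requires the primer sequence on the given strand and, further
    downstream, its reverse complement, so that the two annealed primers
    have 3' ends that face each other.
    """
    genome, primer = genome.upper(), primer.upper()
    k = len(primer)
    forward_sites = find_all(genome, primer)
    reverse_sites = find_all(genome, reverse_complement(primer))
    sizes = []
    for f in forward_sites:
        for r in reverse_sites:
            size = r + k - f
            if r > f and min_len <= size <= max_len:
                sizes.append(size)
    return sorted(sizes)
```

The sorted list of product sizes plays the role of the banding pattern on a gel, so two genomes that give different lists with the same primer would be distinguishable with that primer.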

With this technology, scientists were able to distinguish six inbred 
maize lines from each other, and maize hybrids produced by genetic cross¬ 
ings of these inbred lines were shown to have the PCR products of their 
parental inbred lines. RAPD markers have also been used to screen different
strains of the fungus Leptosphaeria maculans, which is the causal agent
of blackleg disease in crucifers. Differences between avirulent (non-disease- 
causing) and virulent (disease-causing) strains could be distinguished on 
the basis of specific RAPD markers, making it easier for scientists to pro¬ 
duce an avirulent strain that could be used as a biological control agent that 
helps to prevent blackleg disease. 


FIGURE 9.19 Ethidium bromide-stained bands following polyacrylamide gel electro¬ 
phoresis of PCR-amplified plant DNA. Three separate oligonucleotide primers 
were used to amplify fragments from each of the two cultivars. Cultivars 1 and 2 
show identical patterns of bands with oligonucleotides A and C. However, they 
have different patterns when oligonucleotide B is used; hence, oligonucleotide B 
can be used to distinguish between cultivars 1 and 2. 

Real-Time PCR

By labeling the DNA that is amplified in a PCR with a fluorescent dye and 
monitoring the fluorescence that results when the dye bound to double-stranded 
DNA is irradiated with light of a certain wavelength, it is possible 
to "watch" the production of PCR products. Moreover, this approach 
allows one to quantify the amount of a specific DNA fragment in the 
starting material. Labeling the DNA is achieved using any one of a variety 
of protocols. In the simplest approach to labeling DNA, researchers add 
dyes that bind to double-stranded DNA and emit fluorescence, and the 
fluorescence intensity increases in proportion to the concentration of 
double-stranded DNA (Fig. 9.20). 

Real-time PCR may be described as occurring in four phases (Fig. 
9.21). In the first, or linear, phase (Fig. 9.21, phase 1), which generally takes 
about 10 to 15 cycles, fluorescence emission at each cycle has not yet risen 
above the background level. In the early exponential phase (Fig. 9.21, 
phase 2), the amount of fluorescence reaches a threshold at which it is 
significantly higher than the background. The cycle at which this occurs is 
known as the threshold cycle (C_T, or CP, depending upon the manufacturer 
of the PCR equipment). The C_T value is inversely correlated with the 
amount of target DNA in the original sample. During the exponential 
phase (Fig. 9.21, phase 3), the amount of product doubles in each cycle 
under ideal conditions, while in the plateau stage (Fig. 9.21, phase 4), the 
reaction components become limited and measurements of the fluorescence 
intensity are no longer useful. To quantitate the amount of target 
DNA in a sample, a standard curve is first generated by serially diluting a 
sample with a known number of copies of the target DNA, and assuming 
all samples are amplified with equal efficiency, the C_T values for each dilution 
are plotted against the starting amount of sample (Fig. 9.22). The 
number of copies of a target DNA in a sample can be determined by 
obtaining the C_T value for the sample and interpolating the starting 
amount from the standard curve. In addition, since during the exponential 
phase the DNA doubles with each cycle, a sample that has four times the 
number of starting copies of the target sequence compared to another 
sample would require two fewer cycles of amplification to generate the 
same number of product strands. 
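
The quantitation described above can be sketched in a few lines of Python. This is a minimal illustration under idealized assumptions, not a validated analysis pipeline: the dilution series is hypothetical, every cycle is assumed to double the product exactly, the constant of 40 cycles is arbitrary, and the function names are invented for this example.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares line C_T = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Read the starting copy number for a measured C_T off the fitted line."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical standards: with perfect doubling, C_T falls by one cycle
# each time the starting copy number doubles.
standards = [1e3, 1e4, 1e5, 1e6, 1e7]
cts = [40 - math.log2(n) for n in standards]

slope, intercept = fit_standard_curve(standards, cts)
print(round(slope, 2))  # -3.32 cycles per 10-fold change in copy number

unknown_ct = 40 - math.log2(5e4)  # simulated measurement of an unknown sample
estimate = copies_from_ct(unknown_ct, slope, intercept)
print(round(estimate))  # 50000

# Four times as many starting copies reach the threshold two cycles earlier.
ct_shift = (40 - math.log2(1e3)) - (40 - math.log2(4e3))
```

A slope close to -3.32 corresponds to perfect doubling; real assays usually deviate somewhat, which is why the standard curve is measured rather than assumed.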

Among its many other uses, real-time PCR has been used to monitor 
Cryptosporidium parvum (a waterborne protozoan parasite that is the causative 
agent of a range of human diseases, including persistent diarrhea and 
severe infections, in infected individuals). This approach is likely to replace 
the more imprecise and time-consuming traditional methods of monitoring 
C. parvum infections, such as histological staining. Similarly, other 
researchers have reported using real-time PCR to quantitate S. enterica contamination 
in food samples. In this case, food samples (chicken and mung 
beans were tested) were rinsed with 100 to 250 mL of water or with a 
physiological saline solution. The liquid was filtered to remove particulate 
matter and then filtered to capture the Salmonella sp. cells. The cells were 
removed from the filter membrane, lysed, and subjected to real-time PCR. 
In this case, the entire procedure took only approximately 3 hours and was 
able to detect and quantitate cell numbers as low as 7 × 10^2 colony-forming 
units (i.e., cells) per 100 mL of liquid. Compared to the existing methodology, 
real-time PCR offers a dramatic improvement in both the sensitivity 
of detection and the time that it takes to complete the analysis. 
FIGURE 9.20 The fluorescent dye SYBR green does not bind to single-stranded DNA 
(A), binds to double-stranded DNA as it is synthesized (B), and is bound to the 
double-stranded amplified DNA (C). Only the bound dye fluoresces. 


In addition to its use in the measurement of pathogenic agents in the 
environment, a variant of real-time PCR may be used to quantitate the 
levels of a variety of mRNAs in different eukaryotic tissues or prokaryotic 
cells. In this case, since the initial target is RNA and not DNA, a reverse 
transcription (RT) step is needed before the real-time PCR. In the first step 
of real-time RT-PCR, the mRNA sample is reverse transcribed to generate 
complementary DNA (cDNA). This may be done in the same tube as the 
subsequent PCR, or the RT reaction and PCR may be carried out in separate 
tubes. Many of the more traditional methods of monitoring gene expression, 
including Northern hybridization, ribonuclease (RNase) protection 
assays, and RNA dot blot hybridizations, are both limited in sensitivity and 
difficult to quantify. However, with the increasingly popular technique of 
real-time RT-PCR, it is possible to detect and quantify mRNA levels that are 
about 10,000- to 100,000-fold lower than those measurable by traditional 
techniques. With real-time RT-PCR, even a single copy of a transcript may 
be detected. The very low levels of RNA that are required for this procedure 
make it the method of choice for monitoring mRNA levels. 

Immunoquantitative Real-Time PCR 

The detection limits of many commercially available immunological methods 
for measuring levels of pathogenic microorganisms are often insufficient to 
FIGURE 9.21 A plot of ΔRn (normalized fluorescence) versus cycle number in a real-time 
PCR experiment. Four phases of PCR are shown. (1) A linear phase, where 
fluorescence emission is not yet above background level. (2) An early exponential 
phase, where the fluorescence intensity becomes significantly higher than the background. 
The cycle at which this occurs is generally known as C_T. (3) An exponential 
phase, where the amount of product doubles in each cycle. (4) A plateau phase, 
where reaction components are limited and amplification slows down. 


FIGURE 9.22 Plot of C_T versus the starting amount of a target nucleotide sequence. 
Fluorescence detection is linear over several orders of magnitude. 













perceive low but still potentially dangerous levels of these organisms. 
Notwithstanding the high degree of specificity that antibodies provide, it 
would be advantageous to be able to increase the sensitivity of various 
immunological assay procedures. One way to do this is to develop a protocol 
that combines the specificity of antibodies with the sensitivity of PCR. 
Figure 9.23 compares an ELISA-type protocol with an immunoquantitative 
real-time PCR procedure. With the ELISA method (Fig. 9.23A), the first antibody 
is coupled to the surface of a microtiter plate. The added antigen binds 
to the first antibody. When the second antibody is added, it binds to a different 
epitope on the antigen. The bound antigen is visualized by the action 
of alkaline phosphatase, bound to the second antibody, which turns a colorless 
substrate into a colored product. With the immunoquantitative real-time 
PCR procedure (Fig. 9.23B), instead of alkaline phosphatase, a streptavidin-biotin 
complex links the second antibody to a 246-bp DNA fragment with a 
known sequence. Once the immunological complex has formed, it may be 
visualized and quantified by performing real-time PCR in the well of the 
microtiter plate, thereby significantly amplifying the signal from the immunological 
complex. In fact, it has been estimated that this procedure is 
approximately 1,000-fold more sensitive than an ELISA. 

Ancestry Determination 

By examining a number of different single-nucleotide polymorphisms 
(SNPs) (i.e., minor variations in DNA sequence) in an individual and comparing 
the pattern of the SNPs to those of other individuals in the population, 
it is possible to infer information regarding an individual's ancestry. 
For an analysis of an individual's ancestry, three different types of DNA can 
be examined: autosomal DNA (which includes all of a person's DNA 
except for the X and Y chromosomes and mitochondrial DNA), which 
originates from a combination of a person's parents' DNA; paternal DNA 
(i.e., the Y chromosome), which is passed on from father to son; and 
maternal DNA (i.e., mitochondrial DNA and the X chromosome), which is 
passed on from a mother to all of her children. 

To perform an analysis of an individual's ancestry (or for paternity 
testing or forensic analysis), DNA is typically extracted and purified from 
buccal swabs (i.e., from cheek cells) or from blood samples. Stretches of 
DNA are then amplified by PCR using primers that target specific regions 
of the genome. The DNA samples are then separated by size on a small 
column by a technique known as capillary electrophoresis (this method has 
generally replaced gel electrophoresis, which was previously used to separate 
these small DNA fragments but was slower and less amenable to automation). 
If the PCR primers are labeled with fluorescent dyes before the 
PCR amplification reaction, fluorescent samples are eluted from the capillary 
column in a characteristic pattern of bands (in much the same way that 
DNA bands form a pattern unique to a particular fragment following gel 
electrophoresis). The sizes of the amplified DNA bands in specific regions 
are determined and are referred to as alleles. With autosomal DNA, each 
person should have two alleles at each site, one from the father and one 
from the mother. Each known allele has a determined frequency of occurrence 
in the general population and among various ethnic groups. After 
testing for approximately 150 to 350 different SNPs, the frequency of certain 
FIGURE 9.23 Antigen detection by ELISA with the substrate color change detected 
spectrophotometrically (A) and by immunoquantitative real-time PCR with the 
amplified DNA detected by measuring the fluorescence of a dye bound to the 
double-stranded DNA (B). 


alleles in an individual is compared to the frequency of those alleles in 
various ethnic groups. This provides an indication of the ethnic background 
or ancestry of that individual. For example, this type of testing may indicate 
that an individual is genetically 70% Northern European, 17% Middle 
Eastern, and 13% Native American. 
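
The comparison of an individual's alleles with population allele frequencies can be sketched in Python as follows. This is a deliberately simplified illustration and not the method used by any testing company: the two reference populations and all allele frequencies are hypothetical, the SNPs are assumed to be independent and in Hardy-Weinberg proportions, and the result is the relative support for each single population rather than the percentage breakdown that an admixture analysis would give.

```python
import math

# Hypothetical frequencies of one tracked allele at three SNPs.
REFERENCE_FREQS = {
    "population_1": [0.90, 0.20, 0.70],
    "population_2": [0.30, 0.60, 0.10],
}

def genotype_log_likelihood(genotype, freqs):
    """Log probability of a genotype given population allele frequencies.

    genotype[i] is the number of copies (0, 1, or 2) of the tracked allele
    at SNP i, reflecting the two alleles present at each autosomal site.
    """
    total = 0.0
    for copies, p in zip(genotype, freqs):
        prob = math.comb(2, copies) * p ** copies * (1 - p) ** (2 - copies)
        total += math.log(prob)
    return total

def relative_support(genotype, reference=REFERENCE_FREQS):
    """Normalize the likelihoods so that they sum to 1 across populations."""
    logs = {name: genotype_log_likelihood(genotype, f)
            for name, f in reference.items()}
    top = max(logs.values())
    weights = {name: math.exp(value - top) for name, value in logs.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}

# Hypothetical individual: 2, 0, and 1 copies of the tracked alleles.
support = relative_support([2, 0, 1])
print(max(support, key=support.get))  # population_1
```

With only three SNPs the result is fragile; testing 150 to 350 SNPs, as described above, is what gives such comparisons their statistical weight.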

Since mitochondrial DNA changes only very little over many generations, 
characterization of mitochondrial DNA is an ideal means of tracking 
migrations over many hundreds of generations of human genetic history. 
Analysis of mitochondrial DNA indicates that the observed genetic variation 
in human populations may be divided into divisions of ethnically 
similar individuals called haplotypes. Table 9.3 summarizes the currently 
accepted mitochondrial DNA haplotypes and the groups associated with 
these haplotypes. The root of all human lineages is the L groups in Africa. 
All other groups diverged from these groups after early humans began to 
migrate out of Africa around 150,000 years ago. 

Analysis of paternal (Y chromosome) haplotypes has been used to 
examine the claim that all members of a priestly line of Jewish males called 
"Kohanim" are descended from the family of the biblical Aaron, the brother 
of Moses. According to Jewish tradition, membership in this priestly line 
may be acquired only by males whose biological fathers are Kohanim. 
When Y-linked genetic markers were examined among Jews claiming to be 
Kohanim, as well as men who either were not Jewish or did not claim to be 
Kohanim, the limited variation in the markers examined was entirely consistent 
with a 3,300-year-old origin of this priestly line in a single male or a 
small group of related males. Interestingly, these same Y chromosome 
genetic markers are found among the Lemba, a South African tribal group 
claiming paternal kinship with the Jews of Yemen. 

By mid-2008, there were nearly 30 companies marketing genetic 
ancestry kits; each test costs from $100 to $900. Moreover, in the few years 
that they have been available, more than 500,000 people have purchased 
these tests, and there is every indication that the demand will continue to 
grow. Some geneticists, anthropologists, and epidemiologists have publicly 
expressed concern that naive laypersons, anxious for definitive information 
about their personal genetic ancestry, often misinterpret the results of 
these tests. For example, the fact that a particular allele or haplotype is 

TABLE 9.3 Human mitochondrial haplotypes and the groups associated with these haplotypes 

Haplotype group   Associated traits 
A                 Originated in Asia ~60,000 years ago; currently found widely in Asia; a precursor of Native Americans 
B                 Originated in Asia ~50,000 years ago; subgroup B2 is one precursor of Native Americans 
C                 Originated in Asia ~60,000 years ago; includes the Siberian region of northern Asia and is a precursor of Native Americans 
CZ                Found in modern Eurasian populations in northern and eastern Asia, including Siberia 
D                 Originated in Asia ~60,000 years ago; along with groups A, B, and C, this group is thought to have produced Native Americans; currently found in northern and eastern Asia 
E                 Not a well-characterized group; currently found among some people in Argentina 
F                 Originated in eastern Asia; from haplotype group R1; currently found in China and Japan 
G                 Currently found in northeastern Siberia; also found among indigenous people of Kamchatka 
H                 Common in the Middle East and northern Africa; ancestor to about half of all Europeans; a prominent subgroup of HV 
HV                Originated ~20,000 years ago; a progenitor of groups H and V; commonly found in modern western Europeans 
I                 Originated ~30,000 years ago; currently found in both southern Europe and northern Africa 
J                 Originated in Mesopotamia ~10,000 years ago; currently found in Russia and eastern Europe 
JT                Derived from group R and a progenitor of groups J and T 
K                 Originated ~18,000 years ago in Eurasia; currently found in some parts of western Europe 
L1                Originated ~150,000 years ago in Africa; this haplotype group represents the group from which all of humankind is thought to descend; currently found in West and Central Africa 
L2                Originated from haplotype group L1 in Africa ~70,000 years ago; currently commonly found in sub-Saharan Africa and among American blacks 
L3                Originated from haplotype group L1; gave rise to haplotype groups M and N; currently commonly found in East Africa 
M                 Originated from haplotype group L3 ~80,000 years ago; this group is thought to have migrated into Eurasia ~60,000 years ago; currently found throughout southern Asia 
M1                Believed to be the result of migration from North Africa and parts of Asia to sub-Saharan Africa 
N                 Originated from haplotype group L3 ~80,000 years ago; this haplotype group is believed to be the progenitor of nearly all European haplotype groups; groups B, U, F, HV, H, and V all descend from haplotype group N 
N1a               Currently found in Iran and other parts of western and central Asia 
N1b               Common in Near and Middle East regions of Asia, as well as among Ashkenazi Jewish people 
Pre-HV            Widely represented in the Middle East and parts of eastern Africa; the ancestor of haplotype groups HV, H, and V 
Q                 Currently found in the southern Pacific region, especially New Guinea and Melanesia 
R                 Descended from haplotype group N; currently found throughout Asia and Eastern Europe 
5' ATGACCAACA TTCGAAAATC CCACCCACTA ATAAAAATTA TAAACAACTC 3' 

FIGURE 9.24 (A) PCR amplification of a portion of mammalian mitochondrial DNA. 
(B) The consensus sequence of the first 50 nucleotides of the cytochrome b gene. 
Minor variations from the consensus sequence are characteristic of specific genera 
and species of various mammals. Adapted from Hsieh et al., Forensic Sci. Int. 
122:7-18, 2001. 


quite common in certain populations does not mean that its presence is a 
definitive diagnostic of membership in those populations. There is often 
high genetic diversity within populations, and gene flow can readily occur 
between populations. Thus, a particular allele could have been inherited 
from a population in which it is less common rather than from the population 
in which it appears to be "diagnostic," leading some individuals to 
mistakenly believe that they are genetically part of a particular ethnic or 
racial group. 

Animal Species Determination 

In many parts of the world, numerous large mammalian species have been 
hunted to the verge of extinction because of the trade in their skin, bones, 
horns, or other body parts. As part of an effort to stem the illegal traffic in 
wild-animal remains and enforce international conventions designed to 
prevent this trade, a number of laboratories have been set up to determine 
the species of origin from powdered animal remains (so prepared because 
of their supposed medicinal properties). Currently, the method of choice 
for animal species determination involves DNA typing. More specifically, 
using PCR, a portion of the animal's cytochrome b gene, which is found in 
the mitochondrial DNA, is amplified and then sequenced (Fig. 9.24). This 
locus was chosen because it is both sufficiently conserved so that it is 
present in all mammals and sufficiently polymorphic that members of different, 
but closely related, species can be distinguished. The primers used 
in this method amplify a 486-bp DNA fragment, which is sufficient for 
DNA sequencing and to identify most mammals despite the fact that the 
sample DNA is often somewhat degraded. To ensure that high-quality data 
are obtained, it is necessary to (1) start with 20 to 50 ng of DNA template 
and (2) use a control DNA sample (usually mouse or cow DNA) that is 
treated in parallel to the target DNA. 
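
The idea of assigning a sample to a species by its small departures from a conserved sequence can be sketched as follows. The consensus below is the 50-nucleotide sequence shown in Fig. 9.24; everything else is an assumption made for illustration: the two reference sequences are invented variants (they are not real species sequences), the function names are hypothetical, and simple mismatch counting stands in for the database searches used in practice.

```python
CONSENSUS = "ATGACCAACATTCGAAAATCCCACCCACTAATAAAAATTATAAACAACTC"

def substitute(seq, changes):
    """Return seq with the base at each 0-based position in changes replaced."""
    bases = list(seq)
    for position, base in changes.items():
        bases[position] = base
    return "".join(bases)

def mismatches(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def closest_reference(query, references):
    """Return (name, mismatch count) for the best-matching reference."""
    name = min(references, key=lambda n: mismatches(query, references[n]))
    return name, mismatches(query, references[name])

# Invented reference variants of the consensus, for illustration only.
references = {
    "hypothetical_species_1": substitute(CONSENSUS, {5: "T", 20: "T"}),
    "hypothetical_species_2": substitute(CONSENSUS, {9: "G", 33: "C", 47: "T"}),
}

# A query that matches the first reference except for one sequencing error.
query = substitute(references["hypothetical_species_1"], {40: "A"})
print(closest_reference(query, references))  # ('hypothetical_species_1', 1)
```

Tolerating a small number of mismatches in this way is one reason a partly degraded sample can still be assigned, provided the amplified fragment is long enough to contain several species-specific differences.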

Automated DNA Analysis 

More than 1,400 different organisms (bacteria, fungi, viruses, and protozoa) 
have been identified as being pathogenic to humans. Moreover, new pathogens 
are constantly being identified and characterized. To limit the damage 
from either a natural epidemic or a bioterrorist attack, it is necessary to 
rapidly identify the organism(s) that is the source of the infectious outbreak 
so that appropriate public health measures may be instituted as rapidly as 
possible. In addition to human diseases, it is also important to rapidly identify 
the causative agents of outbreaks of animal and plant diseases. 

By performing a combination of PCR and electrospray ionization mass 
spectrometry (Fig. 9.25), it is possible to rapidly and accurately identify a 
wide range of human pathogens. Small aliquots of the DNA that is isolated 
from a test sample are placed into a number of different wells of a microtiter 
plate. Each well contains a pair of PCR primers that have been 
designed to amplify a gene product from a broad group of organisms 
within a selected domain of microbial life. For example, workers have 
reported using 12 primers that are targeted to universally conserved 
sequences and 6 primers that are targeted to broad divisions of microbial 
life (such as bacilli). The products of each of the PCRs are electrosprayed 
into a mass spectrometer, and the DNA base sequences of the various 
samples are determined. This technique allows scientists to very rapidly 
home in on the nature of an infectious agent. Of course, this technique is 
facilitated by the fact that the genomic DNA sequences of a large number 
of microbes have already been determined. Moreover, with the very rapid 
progress that has been made recently in DNA sequence analysis, it is reasonable 
to expect that all known pathogens will be fully sequenced by 
2015. 

Molecular Diagnosis of Genetic Disease 

The ability to diagnose the occurrence of specific inherited diseases in 
humans at the genetic level makes it possible for individuals to discover 
whether they or their offspring are at risk. DNA analysis can be used for 
the identification of carriers of hereditary disorders, for prenatal diagnosis 
of serious genetic conditions, and for early diagnosis before the onset of 
symptoms. 

Tests at the DNA level are definitive for determining the existence of 
specific genetic mutations. Previously, genetic testing relied almost exclusively 
on biochemical assays that scored either the presence or the absence 
of a gene product. A DNA-based test does not, however, require expression 
for detection of the mutant gene, thereby making it theoretically possible to 
develop screening assays for all single-gene diseases. 


FIGURE 9.25 Flowchart of an automated system to identify the pathogenic microbes 
in an environmental sample. 
Screening for Cystic Fibrosis 

Often, screening for genetic diseases can be rather complex. This reflects 
the fact that instead of a disease being the consequence of a single alteration 
to the wild-type DNA, as is the case with sickle-cell anemia (see 
below), many diseases are caused by any one of a large number of genetic 
alterations to the normal DNA for that gene. For example, cystic fibrosis, 
one of the most common lethal autosomal recessive disorders in Europeans 
and their descendants, with an incidence of approximately 1 in every 
2,500 live births and a carrier frequency of approximately 1 in 29, is 
caused by mutations to the cystic fibrosis transmembrane conductance 
regulator (CFTR) gene that result in defects in chloride ion transport. 
There are currently nearly 1,400 known mutations to the CFTR gene that 
can result in the development of cystic fibrosis. Screening individuals 
who may be at risk for cystic fibrosis for 1,400 different mutations is a 
daunting task. Fortunately, some of the mutations that cause cystic 
fibrosis are much more common than others (Table 9.4). In fact, over 90% 
of cystic fibrosis patients carry at least one ΔF508 allele, and nearly 50% 
of cystic fibrosis cases are individuals who are homozygous for ΔF508. 
Despite the fact that separate tests are required for each mutation, it is 
estimated that screening individuals for ΔF508 and for the next 20 most 
common mutations should identify approximately 98% of cystic fibrosis-affected 
individuals and carriers. 

Current diagnostic tests for cystic fibrosis include several different 
techniques. One of the most widely used methods is allele-specific oligonucleotide 
dot blots (also called allele-specific hybridization). With this 
technique, genomic DNA or cDNA from an individual is amplified by 
PCR and, following transfer to a membrane, is hybridized (separately) to 
labeled oligonucleotide probes for the mutant (usually ΔF508) and wild-type 
genes (Fig. 9.26). In this way, it is possible to distinguish between 
normal individuals, cystic fibrosis carriers, and cystic fibrosis-affected 
individuals (Fig. 9.27). With this technique, the probe or the probe-target 
complex may be labeled in a variety of ways, including the use of radioactivity, 
enzymes that produce color change when acting on certain substrates 
(see the discussion of the ELISA procedure above), and fluorescent 
dyes. This technique may be automated and is currently commercially 


TABLE 9.4 The most common mutations of the CFTR protein that lead to cystic fibrosis 

Mutation designation   Amino acid change to the CFTR protein 
ΔF508                  Deletion of phenylalanine at position 508 
G542X                  Replacement of glycine at position 542 by a stop codon 
W1282X                 Replacement of tryptophan at position 1282 by a stop codon 
N1303K                 Replacement of asparagine at position 1303 by lysine 
1717-1G>A              Replacement of guanine by adenine at the last nucleotide in the intron preceding nucleotide 1717 in the cDNA 
R553X                  Replacement of arginine at position 553 by a stop codon 
I148T                  Replacement of isoleucine at position 148 by threonine 
3120+1G>A              Replacement of guanine by adenine at the first nucleotide in the intron following nucleotide 3120 in the cDNA 

Adapted from Eshaque and Dixon, Biotechnol. Adv. 24:86-93, 2006. 
Amino acids are numbered starting at the N-terminal end of the protein. 

FIGURE 9.26 Labeled oligonucleotide probe hybridizes to a completely complementary 
genomic DNA but, under stringent conditions (A), not to a DNA sequence in 
which one of the bases in the middle has been altered (B). In this case, the probe was 
designed to hybridize to wild-type DNA. Other probes, designed to hybridize to 
the genomic DNA of known mutations, would not bind to the wild-type DNA. 

available in a kit form that can detect 12 frequent and 17 rare cystic fibrosis 
mutations. 
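
The way a pair of dot blot results is read for a single mutation can be written out as a small decision function. This is a sketch of the interpretation logic only: the function name and the wording of the returned labels are hypothetical, each probe result is reduced to bound or not bound, and the calls apply only to the one mutation the mutant probe was designed for.

```python
def call_genotype(wild_type_probe_bound, mutant_probe_bound):
    """Interpret an allele-specific dot blot for one mutation.

    Each argument states whether the labeled probe (wild type or mutant)
    hybridized to the individual's amplified DNA under stringent conditions.
    """
    if wild_type_probe_bound and mutant_probe_bound:
        return "carrier (heterozygote)"
    if wild_type_probe_bound:
        return "normal for this mutation (homozygous wild type)"
    if mutant_probe_bound:
        return "affected (homozygous mutant)"
    return "no call (neither probe bound)"

for wild_type, mutant in [(True, False), (True, True), (False, True)]:
    print(wild_type, mutant, "->", call_genotype(wild_type, mutant))
```

Because an individual may carry a different mutation at the same locus, a "normal" call for one probe pair does not rule out cystic fibrosis alleles in general, which is why kits test many mutations in parallel.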

Another method that has been marketed as a kit is based on the PCR 
amplification of specific alleles. Using this protocol, several PCRs are performed 
simultaneously for each DNA sample; the primers anneal to different 
regions for different mutations. Following amplification, the presence 
of a DNA band of a particular size indicates that a specific mutation is 
present. In this case, different-size DNA fragments are typically separated 
either by gel electrophoresis or by capillary electrophoresis. This test is 
quite rapid, and it has the ability to detect a variety of different mutations. 
However, it does not distinguish between homozygotes and heterozygotes, 
so a positive response must be followed up by additional tests to determine 
whether a positive test is indicative of a cystic fibrosis gene carrier or 
affected individual. 

The PCR/OLA procedure (described in detail below) is also commonly 
used to detect cystic fibrosis mutations. This technique is considered 
to be highly accurate compared to many other protocols and has the 
highest detection rate of any of the diagnostic tests for this disease. 
Moreover, it is readily amenable to automation. Notwithstanding the success 
with all of the procedures mentioned above, researchers continue to 
refine and develop these and other approaches to the diagnosis of cystic 
fibrosis. 

Sickle-Cell Anemia 

Sickle-cell anemia is a genetic disease that is the result of a single-nucleotide 
change in the codon for the sixth amino acid of the β chain of the 
hemoglobin molecule. In individuals homozygous for the defect (S/S), the 
shape of the red blood cells is irregular (sickle shaped) because the conformation 
of the hemoglobin molecule is distorted by a single amino acid 
change from glutamic acid to valine. The biological ramifications of this 
genetic alteration are severe anemia and progressive damage to the heart, 
lungs, brain, joints, and major organ systems. The anemia is caused by the 
inability of the mutated hemoglobin to carry sufficient oxygen. The life 
expectancy for S/S homozygotes is quite short. Heterozygous individuals 
(A/S) (genetic carriers) have normal-shape red blood cells and no symptoms 
unless they are subjected to extreme conditions, such as high altitude 
or extremes of temperature, that lower the oxygen supply. If both parents 
are heterozygous, there is a 25% chance that a child of theirs will have 
sickle-cell anemia, i.e., will be an S/S homozygote. The sickle-cell anemia 
gene occurs with high frequency among black Africans and their descendants 
and in Hispanic populations. Carrier screening for the sickle-cell 
anemia gene is routinely conducted in the United States so that those individuals 
who are at risk for transmitting the gene to their offspring can be 
identified. One of the test systems is described below. 

The single-nucleotide change in the β-globin gene that causes sickle-cell 
anemia by chance abolishes a CvnI restriction endonuclease site. This 
restriction enzyme recognizes the sequence CCTNAGG and cleaves the 
DNA between the C and the T. (The letter N indicates that any one of the 
four nucleotides can occupy this position.) In the normal gene, the DNA 
sequence is CCTGAGG, whereas in the sickle-cell anemia gene, the 
sequence is CCTGTGG. This difference forms the basis for a DNA diagnostic 
assay (Fig. 9.28A). 

After two oligonucleotide primer sequences that flank the CvnI site 
are added, a small amount of sample DNA can be amplified by PCR (Fig. 
9.28B). The amplified DNA is digested with CvnI (Fig. 9.28C), and the 
cleavage products are separated by gel electrophoresis and visualized by 
ethidium bromide staining of the DNA in the gel. If the CvnI site is 
present, a specific set of DNA fragments is observed (Fig. 9.28D). A different 
profile of DNA fragments occurs if the CvnI site is absent. By this 
procedure, the genetic makeup of a tested person can be determined 
quickly, directly, and easily. Moreover, because of the fortuitous loss of the 
CvnI site, this assay functions without the need for a target-probe hybridization 
reaction. 
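
The logic of the CvnI assay can be reproduced with a short Python sketch. The recognition sequence (CCTNAGG, cut between the C and the T) and the normal and sickle-cell sequences (CCTGAGG and CCTGTGG) are taken from the text; the flanking sequences, the amplicon length, and the function names are hypothetical. For simplicity, the toy amplicon contains only the one diagnostic site, whereas the actual amplified region contains additional CvnI sites (Fig. 9.28C).

```python
import re

# CvnI recognizes CCTNAGG; the lookahead lets every site be reported.
CVNI_SITE = re.compile(r"(?=CCT[ACGT]AGG)")
CUT_OFFSET = 2  # cleavage falls between the second C and the T (CC^TNAGG)

def cvni_fragments(amplicon):
    """Return the fragment lengths (bp) after complete CvnI digestion."""
    cuts = [match.start() + CUT_OFFSET for match in CVNI_SITE.finditer(amplicon)]
    edges = [0] + cuts + [len(amplicon)]
    return [right - left for left, right in zip(edges, edges[1:])]

def expected_bands(genotype, normal, sickle):
    """Distinct band sizes expected on the gel for genotype AA, AS, or SS."""
    alleles = {"AA": [normal], "AS": [normal, sickle], "SS": [sickle]}[genotype]
    sizes = {size for allele in alleles for size in cvni_fragments(allele)}
    return sorted(sizes, reverse=True)

# Hypothetical flanking sequences around the diagnostic site.
LEFT, RIGHT = "A" * 40, "T" * 30
normal = LEFT + "CCTGAGG" + RIGHT
sickle = LEFT + "CCTGTGG" + RIGHT

for genotype in ("AA", "AS", "SS"):
    print(genotype, expected_bands(genotype, normal, sickle))
# AA [42, 35]
# AS [77, 42, 35]
# SS [77]
```

The heterozygote shows both the uncut band and the two cleavage products, which is why all three genotypes can be told apart on a single gel.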

The PCR/OLA Procedure 

Obviously, not all genetic changes that produce defective genes affect 
existing restriction endonuclease sites. Therefore, other strategies for 
detecting single-nucleotide changes are required. One of these procedures 


FIGURE 9.27 Allele-specific oligonucleotide dot blot to diagnose individuals who are 
either carriers of a mutant CFTR gene (heterozygotes) or affected by the disease 
(homozygotes). The dark dot blot indicates that the labeled oligonucleotide has 
bound to the individual's DNA. 
combines PCR with an oligonucleotide ligation assay (OLA); not surprisingly, 
it is called PCR/OLA. 

Let us assume that in a normal gene at a specific site (say, nucleotide 
number 106) the nucleotide pair is A·T; in the mutant form, the nucleotide 
pair at this site is G·C. Knowledge of the sequence of nucleotides on both 
sides of position 106 enables the design and use of two short (20-nucleotide) 
adjacent oligonucleotide sequences that are complementary to one of 
the two native DNA strands (Fig. 9.29). The essential feature of this pair of 
oligonucleotides is that one of them (probe X) has as its last base at the 3' 


FIGURE 9.28 Detection of the sickle-cell anemia gene at the DNA level. (A) A portion 
of the sequence of the wild-type (HbA) and sickle-cell (HbS) human β-globin gene. 
The amino acids (numbered from the N-terminal end of the peptide chain) encoded 
by this portion of the DNA are shown above the DNA sequence. (B) PCR amplification 
of the portion of the β-globin gene containing the CvnI recognition site that is 
altered in the mutant gene. (C) CvnI digestion of the PCR products. The normal 
(wild-type) gene has three CvnI sites between the PCR primers, and the mutant 
gene has two. (D) Size distribution of fragments following gel electrophoresis of 
CvnI-digested PCR-amplified β-globin DNA. AA, homozygous condition for the 
normal β-globin gene; AS, heterozygous condition; SS, homozygous condition for 
the sickle-cell anemia β-globin gene. 
FIGURE 9.29 PCR/OLA procedure. Panel A: synthesis of a pair of oligonucleotide 
probes (probe X and probe Y). Panel B: hybridization of the probes to PCR-amplified 
DNA, including the results with mutant DNA. Abbreviations: B, biotin; D, digoxigenin; 
AP, alkaline phosphatase; SA, streptavidin; A, adenine; C, cytosine; G, guanine; 
T, thymine. 
end the nucleotide that is complementary to the nucleotide at position 106 
of the normal sequence. The other oligonucleotide (probe Y) starts at its 5' 
end with a nucleotide that is complementary to the nucleotide immediately 
adjacent to position 106. When these two probes are hybridized with target 
DNA containing the normal sequence (which has been amplified by PCR), 
the nucleotide at the 3' end of probe X base pairs with the target DNA, and 
probe Y is aligned so that its 5' end lies next to the 3' end of probe X. The 
addition of DNA ligase to the reaction covalently joins probe X and probe 
Y. By contrast, when these two probes are hybridized to mutant DNA in 
which the nucleotide at position 106 is altered, the nucleotide at the 3' end 
of probe X is mismatched and is not able to pair with nucleotide 106 in the 
target DNA sequence; probe Y, however, is perfectly aligned. In this case, 
DNA ligase cannot join probe X and probe Y because of the single-nucle¬ 
otide misalignment. 

Other oligonucleotides (probes) can also be chemically synthesized to 
give a perfect base pair match when nucleotide 106 is mutated. Obviously, 
with this second set of probes, ligation occurs when they are hybridized to 
target DNA that contains the mutant nucleotide, whereas with normal 
target DNA, the single nucleotide pair mismatch prevents the ligation of 
the probes. In short, PCR/OLA is designed to distinguish between two 
possibilities: ligation and no ligation of two input probes. 
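
The probe geometry described above can be sketched in a few lines of Python. This is a minimal illustration, not the published protocol: the target sequence, the 10-nucleotide probe length used in the example, and the function names are all hypothetical, and ligation is modeled by the simplified rule that both probes must pair perfectly with the target.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_ola_probes(target, snp_index, length=20):
    """Design probe X and probe Y (both 5'->3') for a target strand (5'->3').

    Probe X pairs with target[snp_index : snp_index + length], so its 3' end
    pairs with the tested nucleotide. Probe Y pairs with the `length` bases on
    the other side, so its 5' end lies next to the 3' end of probe X.
    """
    if snp_index < length or snp_index + length > len(target):
        raise ValueError("not enough flanking sequence for both probes")
    probe_x = reverse_complement(target[snp_index:snp_index + length])
    probe_y = reverse_complement(target[snp_index - length:snp_index])
    return probe_x, probe_y


def ligates(target, snp_index, probe_x, probe_y):
    """Simplified model: ligation requires both probes to pair perfectly."""
    site_x = target[snp_index:snp_index + len(probe_x)]
    site_y = target[snp_index - len(probe_y):snp_index]
    return (probe_x == reverse_complement(site_x)
            and probe_y == reverse_complement(site_y))


# Hypothetical target sequences that differ only at index 15.
normal = "GATTACAGGCTTCAGGAGTCCTAACGTTGCA"
mutant = normal[:15] + "T" + normal[16:]

normal_x, normal_y = design_ola_probes(normal, 15, length=10)
mutant_x, mutant_y = design_ola_probes(mutant, 15, length=10)

for name, dna in (("normal", normal), ("mutant", mutant)):
    print(name,
          "normal probes ligate:", ligates(dna, 15, normal_x, normal_y),
          "| mutant probes ligate:", ligates(dna, 15, mutant_x, mutant_y))
```

Because probe Y does not cover the tested nucleotide, it is the same in both probe sets; only the 3' terminal base of probe X differs.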

To determine whether ligation has occurred between two indicator 
probes, probe X is labeled at its 5' end with biotin and probe Y is labeled at 
its 3' end with digoxigenin. The low-molecular-weight compound digoxi- 
genin serves as an antibody-binding indicator. After the hybridization and 
ligation steps are carried out, the DNA is denatured to release the hybrid¬ 
ized probes, and the entire mixture is transferred to a small plastic well that 
has been coated with streptavidin. The well is then washed to remove 
unbound material, so only the biotin-labeled probe DNA remains bound. 
Next, antidigoxigenin antibodies, which have been previously coupled to 
alkaline phosphatase, are added to the well. After an additional washing 
step to remove unbound conjugated antidigoxigenin antibodies, a colorless 
chromogenic substrate is added. The appearance of color in the well indi¬ 
cates that antidigoxigenin antibodies have bound to digoxigenin and that 
the digoxigenin-labeled probe was ligated to the biotinylated probe. If no 
color appears, then no ligation occurred. 

With two pairs of probes, it is possible to ascertain the genetic makeup 
of any tested individual at a particular site. For example, heterozygous 
individuals yield positive results with both pairs of probes. The DNA from 
people with two copies of the normal gene gives a positive response only 
with the set of probes that contains the nucleotide complementary to the 
nucleotide at the normal site. Finally, DNA from individuals with two 
altered gene copies will give a positive response only with the set of probes 
that is designed to detect the mutant site. To minimize the amount of the 
original sample DNA that is required for the assay, the segment of the 
target DNA sequence that contains the nucleotide site to be tested is ampli¬ 
fied by PCR before the hybridization reaction. 

Overall, the PCR/OLA system is rapid, sensitive, and highly specific. 
It has even been automated with a robotic workstation to carry out the 
steps of the assay procedures. Under these conditions, as many as 1,200 
ligation reactions can be conducted per day. 

The ligase chain reaction assay is a simpler, albeit less sensitive, variant 
of the PCR/OLA system. Sample DNA is mixed with an excess amount of 
a pair of OLA indicator probes (as described above) in the presence of a
heat-resistant DNA ligase. After an initial ligation reaction at 65°C, the 
temperature is raised to 94°C to denature the probe-target DNA hybrid 
and then lowered to 65°C to allow hybridization of the free, nonligated 
OLA indicator probes to the target DNA. The cycle is repeated 20 times. If 
the OLA indicator probes match the target DNA perfectly, then ligation will 
occur at 65°C during each cycle, and after 20 cycles, enough ligation 
product (probe X joined to probe Y) will accumulate to be observed by 
either gel electrophoresis or an ELISA detection system. If no ligation 


FIGURE 9.30 Schematic representation of the functioning of a padlock probe. (A) 
When the bases at the 5' and 3' ends of the probes are completely paired to the 
target DNA, ligation can take place. When there is a single-base mismatch at the 3' 
end of the probe, ligation cannot occur and the probe assumes a conformation that 
does not allow hybridization. (B) Under stringent conditions, the ligated probe 
remains bound to the target DNA, which is bound to the surface of a 96-well micro¬ 
titer plate. The nonligated probe is removed during washing. The bound probe is 
detected by interaction with the reporter molecules. If the reporter is biotin, then 
avidin and a biotinylated enzyme, such as alkaline phosphatase, are added sequen¬ 
tially. A colored well indicates that the probe is present and bound to the target 
DNA or RNA. 

FIGURE 9.31 Detection of a single-base mutation with fluorescence-labeled PCR
primers. (A) Primers P1 and P2 amplify DNA from the wild-type sequence. The
same primers cannot amplify DNA from the mutant sequence because primer P1 is
mismatched with this DNA. Primer P1 is labeled at its 5' end with rhodamine (red).
Primer P2 is unlabeled. (B) Primers P3 and P2 amplify DNA from the mutant but
not the wild-type sequence. Primer P3 is labeled at its 5' end with fluorescein
(green). Primer P2 is unlabeled. The plus and minus signs denote wild-type and
mutant sites, respectively. The genotypes +/+, +/−, and −/− produce PCR products
that contain rhodamine only, rhodamine and fluorescein, and fluorescein only and
that fluoresce red, yellow, and green, respectively.

FIGURE 9.32 TaqMan assay. (A) A TaqMan probe that is complementary to the
wild-type DNA is added prior to PCR amplification of the DNA sequence. The probe
contains a fluorescent dye attached to its 5' end (green) and a quencher attached
to its 3' end (blue). When the probe is intact, the quencher interacts with the
fluorophore, quenching its fluorescence. The PCR primers are indicated by arrows.
(B) In the extension phase of PCR, the TaqMan probe is displaced by the growing
DNA strand. (C) Subsequently, the 5' fluorescent dye is cleaved from the probe by
the 5' nuclease activity of the Taq polymerase, leading to a dramatic increase in
the fluorescence of the reporter dye (shown as a starburst of color from the dye).


occurs because of a mismatch, then no joined probe product will be pro¬ 
duced or detected. 
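
A rough way to see why this assay is less sensitive than PCR/OLA is to count the ligated product. The sketch below assumes that, with a single pair of probes, only the original target strands act as templates, so each ligation round adds at most one joined probe per target strand and accumulation is linear rather than exponential. The function name and the efficiency parameter are illustrative assumptions, not values from the text.

```python
def ligated_product(target_strands, ligation_rounds, probes_match, efficiency=1.0):
    """Estimate the number of joined probe X-probe Y molecules.

    Assumes linear accumulation: every round, each target strand templates
    at most one ligation, scaled by a hypothetical per-round efficiency.
    A mismatch at the junction is modeled as giving no product at all.
    """
    if not probes_match:
        return 0.0
    return target_strands * ligation_rounds * efficiency


# 1,000 target strands carried through 20 ligation rounds.
print(ligated_product(1000, 20, probes_match=True))
print(ligated_product(1000, 20, probes_match=False))
```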

Padlock Probes 

A padlock probe is an oligonucleotide that is complementary to a target 
(DNA or RNA) sequence at its 5' and 3' ends but not in its middle region 
(Fig. 9.30). When a padlock probe hybridizes to its target sequence, the 5' 
and 3' ends of the probe come into close proximity with one another and 
the middle portion loops out. Following hybridization, if the probe is 
exactly complementary to the target sequence, the 5' and 3' ends of the 
probe can be joined to one another by DNA ligase. The fact that two 
sequences (actually two ends of the same oligonucleotide) must bind to 
the target ensures a high "specificity of detection." For DNA ligation to 
occur, both sequences must hybridize perfectly to the target; this makes it 
possible to easily detect allelic sequence variants. If there is a mismatch at 
either end, no ligation occurs. Following the ligation reaction, the probe- 
target hybrid can be detected because of reporter molecules, such as 
biotin or digoxigenin, that are attached to the middle (linker) portion of 
the padlock probe. Padlock probes typically have sequences approxi¬ 
mately 15 to 20 nucleotides in length at the 5' and 3' ends that are comple¬ 
mentary to the target sequence and a middle region of approximately 50 
nucleotides. This procedure has become popular with researchers, as it is 
simpler, with fewer steps than the OLA procedure. In addition, the proce¬ 
dure requires one oligonucleotide compared to two for OLA, and it is 
amenable to automation. 
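
The layout of a padlock probe can be illustrated with a short Python sketch. It assumes a hypothetical 48-nucleotide target, 20-nucleotide arms, and a placeholder 50-nucleotide poly(T) linker; a real linker would be chosen so that it does not pair with the target and would carry the reporter molecules. Circularization is modeled by the simplified rule that both arms must pair perfectly.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_padlock(target, junction, arm_length=20, linker="T" * 50):
    """Build a padlock probe (5'->3') whose two ends meet at `junction`.

    The 5' arm pairs with the target bases just before `junction` and the
    3' arm pairs with the bases starting at `junction`, so the two ends of
    the hybridized probe lie side by side and the linker loops out.
    """
    if junction < arm_length or junction + arm_length > len(target):
        raise ValueError("not enough flanking sequence for both arms")
    five_prime_arm = reverse_complement(target[junction - arm_length:junction])
    three_prime_arm = reverse_complement(target[junction:junction + arm_length])
    return five_prime_arm + linker + three_prime_arm


def can_circularize(padlock, target, junction, arm_length=20):
    """Simplified model: ligation requires both arms to pair perfectly."""
    site_5 = target[junction - arm_length:junction]
    site_3 = target[junction:junction + arm_length]
    return (padlock[:arm_length] == reverse_complement(site_5)
            and padlock[-arm_length:] == reverse_complement(site_3))


# Hypothetical target (5'->3') and a variant that differs at the junction base.
target = "GATTACAGGCTTCAGGAGTCCTAACGTTGCAGGATCCATTGACCGTAA"
variant = target[:24] + ("A" if target[24] != "A" else "C") + target[25:]

padlock = design_padlock(target, junction=24)
print(len(padlock), can_circularize(padlock, target, 24),
      can_circularize(padlock, variant, 24))
```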

Genotyping with Fluorescence-Labeled PCR Primers 

PCR primers labeled with different fluorescent dyes can be used in the 
development of nonradioactive color-based detection systems. To distin¬ 
guish between mutant and wild-type DNAs, PCR is performed with two 
different primers. One is exactly complementary to the wild-type DNA 
and is labeled at its 5' end with rhodamine (red). The other is complemen¬ 
tary to the mutant DNA and is labeled at its 5' end with fluorescein (green) 
(Fig. 9.31). In both cases, amplification is programmed by a third, unla¬ 
beled primer that is complementary to the opposite strand. Since PCR 
amplification can occur only when the primer is exactly complementary to 
the target DNA, the presence of these three primers in the same reaction 
mixture will result in the amplification of either the wild-type or the 
mutant DNA or both, depending on which target DNAs are initially 
present to act as PCR templates. If an individual is homozygous for the 
wild-type DNA, after PCR and removal of unincorporated primer, the 
reaction mixture will fluoresce red; if he or she is homozygous for the 
mutant DNA, the reaction mixture will fluoresce green; and if he or she 
has both mutant and wild-type DNA (i.e., is heterozygous), the reaction 
mixture will fluoresce yellow. This assay can be automated and adapted 
for any single-nucleotide target site of any gene that has been sequenced. 
The problem with this technique is that it is limited to detecting an SNP.
Analysis of multiple loci is not possible, since the presence of many different
PCR primers in one reaction tube could lead to large numbers of
cross-reactions among primer pairs, with a large number of nonspecific
PCR products being formed.
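
The color readout of this assay follows directly from which labeled primers are extended. As a minimal sketch (the dictionary and function names are illustrative only), each allele present in the sample contributes the dye of its matching primer, and the combination of dyes determines the color of the reaction mixture:

```python
# Dye carried by the primer that matches each allele (from the text).
PRIMER_DYE = {"wild-type": "rhodamine", "mutant": "fluorescein"}

# Observed color for each combination of incorporated dyes.
COLOR = {
    frozenset({"rhodamine"}): "red",
    frozenset({"fluorescein"}): "green",
    frozenset({"rhodamine", "fluorescein"}): "yellow",
}


def reaction_color(alleles):
    """Return the fluorescence color expected for an individual's two alleles."""
    dyes = frozenset(PRIMER_DYE[allele] for allele in alleles)
    return COLOR[dyes]


for genotype in (("wild-type", "wild-type"),
                 ("wild-type", "mutant"),
                 ("mutant", "mutant")):
    print(genotype, "->", reaction_color(genotype))
```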

TaqMan Assay

The TaqMan protocol is used to check individuals for the presence of SNPs 
that are indicative of any of a variety of genetic diseases (Fig. 9.32). Made 
popular by one particular company, it is based on the 5' nuclease activity of 
Taq polymerase, which is commonly used to amplify DNA in PCR applica¬ 
tions. To simultaneously monitor wild-type and mutant alleles, two 
TaqMan probes are utilized. Each probe is exactly complementary to either 
the wild-type or the mutant DNA sequence, and each probe has a different 
fluorescent dye attached to its 5' end. Intact probes, whether bound or 
unbound to cDNA, do not fluoresce because of the presence of the quencher 
molecule at the 3' end of the probe. As PCR proceeds from primers flanking 
the probe hybridization site, the TaqMan probe is displaced by the growing 
DNA strand, and the 5' nuclease activity of the Taq polymerase degrades 
the 5' end of the TaqMan probe, thereby releasing the fluorescent dye and 
removing it from the proximity of the quencher molecule. Thus, only 
TaqMan probes that were previously bound to target DNA will be degraded 
and subsequently fluoresce. Any mismatched probes, due to mutations in 
the region where the TaqMan probe binds, will be displaced but not 
cleaved, so they will not fluoresce. By monitoring the fluorescence at two 
different wavelengths (one for each TaqMan probe), it is possible to distin¬ 
guish the wild type, heterozygotes (carrying one mutant and one wild-type 
gene), and individuals that are homozygous for the target mutation. In fact, 
this technique may be used to assay for two or three mutations at the same 
time. The only requirements for the successful employment of the tech¬ 
nique are that (1) the precise DNA sequences of the target DNAs must be 
known and (2) the fluorescent dyes must have well-separated, nonoverlap¬ 
ping fluorescence maxima. 
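
In practice, the two fluorescence readings have to be converted into a genotype call. The sketch below shows one simple way to do this, assuming a single hypothetical signal threshold in arbitrary fluorescence units; real instruments use their own calibration and calling rules, so the threshold and the "no call" outcome are assumptions for illustration.

```python
def call_taqman_genotype(wild_type_signal, mutant_signal, threshold):
    """Classify a sample from the fluorescence of the two TaqMan probes.

    A probe is taken to have been cleaved (its allele is present) when its
    dye signal reaches the threshold. The threshold is a hypothetical value.
    """
    wild_type_present = wild_type_signal >= threshold
    mutant_present = mutant_signal >= threshold
    if wild_type_present and mutant_present:
        return "heterozygous"
    if wild_type_present:
        return "homozygous wild type"
    if mutant_present:
        return "homozygous mutant"
    return "no call"


# Made-up readings (arbitrary units) for three samples.
for reading in ((950, 40), (510, 480), (30, 900)):
    print(reading, "->", call_taqman_genotype(*reading, threshold=200))
```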


SUMMARY 


To be effective, a diagnostic test must be (1) specific for the
target molecule, (2) sensitive enough to detect minute 
levels of the target, and (3) technically simple, with unequiv¬ 
ocal results that can be obtained readily. There are two catego¬ 
ries of molecular diagnostic techniques. One category relies on 
the specificity of an antibody for a particular antigen. The 
other uses nucleic acid hybridization or PCR to detect a spe¬ 
cific nucleic acid sequence. 

A common assay that uses antibodies is the ELISA. In this 
procedure, (1) a sample is bound to a solid support, (2) a pri¬ 
mary antibody specific for the target antigen is added and 
binds to the target antigen, (3) a secondary antibody-enzyme 
conjugate that binds to the primary antibody is added, and (4) 
a colorless substrate that is transformed into a colored com¬ 
pound by the enzyme in the conjugate is added. The appear¬ 
ance of a color response in an ELISA indicates the presence of 
the target molecule in the sample. 

ELISAs have been used for detecting various proteins, 
identifying viruses and bacteria, and determining the pres¬ 
ence of low-molecular-weight compounds in a wide range of 
biological samples. To increase the specificity of the primary
antibody and to ensure the reliability of the antibody preparation,
monoclonal antibodies are often used for diagnostic
ELISAs.

Nucleic acid hybridization can be a highly sensitive and 
specific method for detecting the presence of a nucleic acid 
sequence in a biological sample. This method has been used to 
develop diagnostic assays for disease-causing organisms in a 
clinical setting and other organisms in the environment. 

Because a nucleic acid detection assay is directed toward a 
known DNA sequence, primers for PCR can be synthesized 
and then used to amplify the target sequence. The detection 
assay can be run in a nonradioactive system, such as the 
biotin-streptavidin-chemiluminescence protocol, or the 
amplified PCR product can be scored by gel electrophoresis. 
Also, a PCR product can be labeled with a fluorescent dye that 
is attached to the 5' end of the primer. 

One way to characterize forensic samples is by DNA fin¬ 
gerprinting. In this technique, human minisatellite DNA, 
which does not encode any proteins and is highly variable in 
sequence, is usually used as a hybridization probe. The exten¬ 
sive variability of human minisatellite DNA sequences means 
that each human being produces a unique set of hybridized 
DNA bands. 

To characterize plant DNA, a set of arbitrary oligonucleotide
primers can be used to amplify random segments of the
plant DNA by PCR and, after electrophoresis, to produce a
specific set of DNA bands. This procedure is called the RAPD
procedure. Any particular set of primer sequences will produce
a unique collection of amplified DNA fragments that is
characteristic of the genomic DNA of a particular plant cultivar.

DNA diagnostic assays can also be used to detect the pres¬ 
ence of a single-nucleotide change in a particular gene. One of 
these methods distinguishes between the ligation and the 
absence of ligation of two oligonucleotides. A single-nucle¬ 
otide mismatch at the junction of the hybridized oligonucle¬ 
otides prevents ligation. In general, the use of PCR increases 
the resolution of nucleic acid diagnostic tests and should also 
decrease the overall costs of these assays. 


The development of molecular diagnostic assays is a 
growing and dynamic field. Although the technical details of 
various tests may differ, the general principles have been 
established. At present, PCR has contributed significantly to 
overcoming the problem of the limited availability of target 
DNA. The use of PCR for probe systems has eliminated most 
concerns about the sensitivity of the detection signal, with the 
result that nonradioactive chromogenic, chemiluminescent, or 
fluorescent systems can be used reliably for certain assays. 
Moreover, in a number of tests, PCR treatment and electro¬ 
phoretic analysis are sufficient to determine the presence of 
either a genetic mutation or an infectious agent in the targeted 
sample. Undoubtedly, many novel DNA-based systems will 
be created for the diagnosis of most, if not all, of the common 
genetic, infectious, and malignant diseases. 

REVIEW QUESTIONS


1. Briefly describe how the change in the human β-globin
gene that gives rise to sickle-cell anemia can be detected by 
using PCR. 

2. Describe and discuss the PCR/OLA detection protocol. 

3. What is an ELISA? How does it work? 

4. Describe several types of nonradioactive DNA labels. What 
are the advantages of nonradioactive detection procedures? 

5. You have been given the task of developing a simple, sensitive,
and reproducible diagnostic procedure for a double-stranded
DNA virus that is devastating a local cattle
population. Because effective treatment of this disease depends
on early and correct diagnosis, you need to be able to detect
the very low levels of this virus that are present in infected
animals before the onset of disease symptoms. Briefly explain
how you would proceed and why you have chosen a particular
course of action.

6. For diagnostic assays, what is meant by sensitivity, speci¬ 
ficity, and simplicity? 

7. How is Chagas disease currently diagnosed? How might 
the existing procedures be improved? 

8. What is a molecular beacon probe, and how does it work? 

9. What is DNA fingerprinting, and how is it used to charac¬ 
terize traces of DNA in forensic samples? 

10. What is the RAPD procedure, and how can it be used to
characterize plant cultivars? 

11. What is a padlock probe, and how is it used? 

12. What are monoclonal antibodies? How are they different 
from polyclonal antibodies? 

13. Briefly, explain how the HAT selection for hybridomas 
works. 

14. How can molecular beacon probes be used to (1) detect 
several genes in the same sample and (2) characterize an indi¬ 
vidual's genotype for a particular genetic disease, such as 
sickle-cell anemia? 

15. Why is it difficult to screen an individual's chromosomal 
DNA to assess whether he or she carries a mutation of the 
CFTR gene that leads to cystic fibrosis? 


16. Why is it useful to simultaneously employ several dif¬ 
ferent-color fluorescent proteins? 

17. How would you develop microbial biosensors to detect 
environmental contaminants? 

18. What is real-time PCR? What is it used for? How does it 
work? 

19. What is immunoquantitative real-time PCR, and how 
does it work? 

20. What is a padlock probe, and how does it work? 

21. What is the TaqMan assay procedure, and how can it be 
used to assay SNPs? 

Protein Therapeutics


Prior to the development of recombinant DNA technology, most
human protein pharmaceuticals were available in only limited quanti¬ 
ties, they were extremely costly to produce, and, in a number of cases, 
their biological modes of action were not well characterized. When recom¬ 
binant DNA technology was first developed, it was heralded as a means of 
producing a whole range of possible human therapeutic agents in sufficient 
quantities for both efficacy testing and eventual human use. This forecast 
has turned out to be true. Today, the "genes" (mostly complementary DNAs 
[cDNAs]) for several thousand different proteins that are potential human 
therapeutic agents have been cloned. Most of these sequences have been 
expressed in mammalian as well as bacterial host cells, and currently more 
than 500 are undergoing clinical testing with human subjects for the treat¬ 
ment of various diseases. More than 250 of these "biotechnology drugs" 
have been approved for use in the United States or the European Union 
(Table 10.1). However, it will be several years before many of the other pro¬ 
teins are commercially available, because medical products must first be 
tested rigorously in animals and then undergo thorough human trials, 
which can last for several years, before being approved for general use. 
However, the financial incentive for pharmaceutical companies is consider¬ 
able. It has been estimated that in 2006 the annual global market for human 
recombinant protein drugs was about $60 billion. Ten "blockbuster" drugs 
constitute nearly half of these sales. For example, in 2006, rituximab 
(Rituxan), a monoclonal antibody used to treat individuals with non- 
Hodgkin lymphoma, generated nearly $4 billion in sales, while various 
forms of recombinant human insulin generated around $2.5 billion. 

The development of preventive procedures and treatments for human 
diseases was the outstanding contribution of medicine and science to 
human well-being in the 20th century. This process, however, is a contin¬ 
uous one. So-called old diseases (e.g., tuberculosis) can reappear if preven¬ 
tive measures are relaxed or if antibiotic-resistant organisms arise. The idea 
of using antibodies as therapeutic agents has come to fruition in the past 
several years, and specific antibodies are being tested to attack toxins, bac¬ 
teria, viruses, and even cancer cells. An antibody may be viewed as a 

TABLE 10.1 Examples of recombinant proteins that have been approved for human use in the United States or the
European Union

Alglucosidase α
Anakinra
Antihemophilic factor
Darbepoetin α
Dibotermin
DNase I
Drotrecogin α
Erythropoietin
Factor VIIa
Factor VIII
Factor IX
Follicle-stimulating hormone
α-Galactosidase
Galsulfase
Glucagon
β-Glucocerebrosidase analogue
Granulocyte-macrophage colony-stimulating factor
Hirudin
Human growth hormone
Human growth hormone analogue
Hyaluronidase
Insulin
Insulin analogue
Insulin-like growth factor 1
Interferon-α2a
Interferon-α2b
Interferon-αN3
Interferon-β1a
Interferon-β1b
Interferon-γ1b
Interferon-N
Interferon analogues
Interleukin-2
Interleukin-2 analogues
Interleukin-11
Interleukin-11 analogue
Keratinocyte growth factor
Laronidase
Novel erythropoiesis-stimulating protein
Osteogenic protein
Platelet-derived growth factor
Stem cell factor
Tissue plasminogen activator
Thyrotropin-α
Truncated tissue plasminogen activator


target-seeking missile or as a magic bullet that either can directly neutralize 
an offending agent or, if equipped with a warhead or poison arrow, can 
destroy a specific target cell. 

Pharmaceuticals 

Isolation of Interferon cDNAs 

A number of different strategies have been used to isolate either the genes 
or cDNAs for human proteins. In some cases, the target protein is isolated 
and a portion of the amino acid sequence is determined. From this informa¬ 
tion, a DNA coding sequence is deduced. The appropriate oligonucleotide 
is synthesized and used as a DNA hybridization probe to isolate the gene 
or cDNA from either a genomic or a cDNA library. Alternatively, antibodies 
are raised against the purified protein and used to screen a gene expression 
library. For human proteins that are synthesized primarily in a single 
tissue, a cDNA library from the messenger RNA (mRNA) of that tissue is 
enriched for the target DNA sequence. For example, the major protein syn¬ 
thesized by the islets of Langerhans of the pancreas is insulin; 70% of the 
mRNA fraction isolated from these cells encodes insulin. 

Before the completion of sequencing of the human genome in 2001, it 
was often necessary to devise innovative approaches to isolate human 
genes or cDNAs, especially when the proteins encoded were found in very 
low concentrations or when the site of synthesis was not known. The 
human interferon (IFN) proteins, which include IFN-α, IFN-β, and IFN-γ,
are naturally occurring proteins, each one with somewhat different bio¬ 
logical activity. When the IFN cDNAs were initially isolated in the early 
1980s, very little was known about the encoded proteins (IFN was origi¬ 
nally thought to be a single protein), so a novel scheme had to be devised 
to overcome the scarcity of both the mRNAs and the proteins. Also, at the 
time, Escherichia coli expression vectors for eukaryotic cDNAs were not 
readily available, so it was necessary to devise an indirect scheme to isolate
IFN cDNA. The isolation of IFN cDNAs included the following steps
(Fig. 10.1).

1. Size-fractionated mRNA was isolated from human leukocytes, 
reverse transcribed, and inserted into the PstI site of plasmid 
pBR322. 

2. The approximately 6,000 clones that were produced following 
transformation of E. coli were divided into 12 pools of 512 clones 
each. Pools of clones, rather than individual clones, were tested to 
speed up the identification process. 

3. The plasmid DNA from each pool was hybridized to a crude IFN 
mRNA preparation. 

4. The input mRNA that hybridized to the plasmid DNA was sepa¬ 
rated from the cloned DNA-mRNA hybrids and translated in a 
cell-free protein synthesis system. 

5. Each translation mixture was then assayed for IFN antiviral 
activity. The pools that showed IFN activity contained at least one 
clone with a cDNA that hybridized to IFN mRNA. 

6. Positive pools were divided into eight subgroups of 64 clones each 
and retested (i.e., steps 3 to 5 were repeated). This subgrouping 
process was repeated until a clone with the complete cDNA for a 
human IFN was identified. 

Subsequently, whenever large quantities of the IFN were required, the 
IFN cDNAs could be subcloned into an E. coli expression vector and 
expressed at high levels. 
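
The logic of testing pools rather than individual clones (steps 2 to 6) can be sketched as a simple simulation. In this illustration, which is an assumption-laden model rather than the original laboratory procedure, the function `pool_is_positive` stands in for the whole hybridization, cell-free translation, and antiviral assay sequence, and the position of the IFN clone is made up.

```python
def split_into_pools(clones, n_pools):
    """Split a list of clones into at most n_pools roughly equal pools."""
    if not clones:
        return []
    size = -(-len(clones) // n_pools)  # ceiling division
    return [clones[i:i + size] for i in range(0, len(clones), size)]


def screen_by_pooling(clones, pool_is_positive, first_split=12, later_split=8):
    """Find positive clones by repeatedly subdividing positive pools.

    Returns the positive clones and the number of pool assays performed.
    """
    assays = 0
    hits = []
    pools = split_into_pools(clones, first_split)
    while pools:
        next_round = []
        for pool in pools:
            assays += 1
            if not pool_is_positive(pool):
                continue
            if len(pool) == 1:
                hits.append(pool[0])
            else:
                next_round.extend(split_into_pools(pool, later_split))
        pools = next_round
    return hits, assays


# 12 pools of 512 clones; clone 4000 is a hypothetical IFN cDNA clone.
clones = list(range(12 * 512))
ifn_clones = {4000}


def pool_is_positive(pool):
    """Stand-in for hybridization, translation, and the antiviral assay."""
    return any(clone in ifn_clones for clone in pool)


hits, assays = screen_by_pooling(clones, pool_is_positive)
print(hits, assays)  # one clone found with 36 pool assays (12 + 8 + 8 + 8)
```

With a single positive clone, the pools shrink from 512 to 64 to 8 to 1 clone, so 36 pool assays replace the 6,144 that testing every clone individually would require.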

Human Interferons 

After the isolation of the first IFN gene, researchers found that there are a 
number of different IFNs. On the basis of chemical and biological proper¬ 
ties, the IFNs can be classified, as noted above, into three different groups: 
IFN-α, IFN-β, and IFN-γ. The proteins IFN-α and IFN-β are synthesized in
cells that have been exposed to viruses or viral RNA; IFN-γ is synthesized
in response to cell growth-stimulating agents. IFN-α is encoded by a family


FIGURE 10.1 Overview of the protocol used to isolate IFN cDNA. 
of 13 different (but similar) genes, IFN-β is encoded by two genes, and
IFN-γ is encoded by a single gene. The IFN-α subtypes have different
specificities. For example, the antiviral activities of IFN-α2 and IFN-α1 are
approximately the same when assessed with a virus-challenged bovine cell
line, but IFN-α2 is seven times more effective than IFN-α1 when human
cells are treated with virus. IFN-α2 is 30 times less effective than IFN-α1
when mouse cells are used in this assay.

Interferon gene shuffling. Several research groups have attempted to engineer
IFNs with combined properties based on different members of the
IFN-α gene family that vary in the extents and specificities of their antiviral
activities. Theoretically, this can be achieved by splicing a portion of one
IFN-α gene with a DNA sequence from a different IFN-α gene to create,
after translation, a hybrid protein that exhibits novel properties, i.e., properties
different from those of the proteins encoded by either contributing gene.

In one study, hybrid genes from IFN-α2 and IFN-α3 were constructed
in an effort to create proteins with novel IFN activities. Comparison of the
sequences of the two IFN-α cDNAs indicated that they had common
restriction sites at positions 60, 92, and 150. Digestion of both cDNAs at
these sites and ligation of the DNA fragments yielded a number of hybrid
derivatives of the original genes (Fig. 10.2). These hybrids were expressed
in E. coli, and the resultant proteins were purified and examined for various
biological functions. When tested for the extent of protection of mammalian
cells in culture against viral infection, some of the hybrid IFNs were
found to have greater activity than the parental molecules. In addition,
many of the hybrid IFNs induced test cells to synthesize (2'-5')-oligoisoadenylate
synthetase. This enzyme generates (2'-5')-linked oligonucleotides,
which in turn activate a latent cellular endoribonuclease that cleaves viral
mRNA. Other hybrid IFNs had an antiproliferative activity against various
human cancers that was greater than that of either of the parental molecules.
More recently, additional hybrid IFN molecules have been generated
by a variation of the above-mentioned procedure. In this case, the entire
IFN-α cDNA family was PCR amplified and then digested with DNase into
small DNA fragments (~50 to 60 nucleotides long) before the fragments
were shuffled and amplified by PCR (Fig. 10.3). This procedure works
because the PCR mixture contains many overlapping single-stranded
DNAs that can act as PCR primers (see "Chemical Synthesis of DNA" in
chapter 4). Following testing of the many shuffled IFN cDNAs, it is possible
to select hybrid IFNs with vastly improved antiviral or antiproliferative
activities. In fact, some hybrid IFNs have recently undergone successful
clinical trials (Box 10.1) and have been approved for use as human therapeutic
agents. The strategy for creating hybrid IFNs can also be applied to
other gene families whose products have therapeutic potential.
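
For the restriction site approach described at the start of this discussion, the number of possible hybrids is easy to enumerate. The sketch below assumes that the three shared sites divide each cDNA into four segments, that fragments are religated only in their original order, and that each segment can come from either parent; the parent labels and function name are illustrative.

```python
from itertools import product


def hybrid_arrangements(parents=("alpha2", "alpha3"), cut_sites=(60, 92, 150)):
    """List every segment-by-segment combination that mixes both parents.

    Three shared cut sites give four segments. Each arrangement names the
    parent that supplies each segment, in order; the two all-parental
    arrangements are excluded because they are not hybrids.
    """
    n_segments = len(cut_sites) + 1
    arrangements = product(parents, repeat=n_segments)
    return [a for a in arrangements if len(set(a)) > 1]


hybrids = hybrid_arrangements()
print(len(hybrids))  # 14 hybrids (2**4 arrangements minus the 2 parental ones)
for arrangement in hybrids[:4]:
    print("-".join(arrangement))
```

Under these assumptions, only 14 hybrids are possible, which helps explain why random fragmentation and shuffling of the whole gene family can sample a far larger range of sequences.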

Longer-acting interferons. Hepatitis C virus infection is one of the most 
common causes of liver disease, which affects nearly 200 million people 
worldwide. Many of these individuals eventually develop either cirrhosis 
of the liver or hepatocellular carcinoma. Therapeutic agents that maximize 
early antiviral response and maintain viral suppression throughout the 
course of therapy have the best chance of achieving lasting eradication of 
the virus from an infected individual. One effective treatment for hepatitis 
C includes the combined use of the antiviral chemical compound ribavirin 
with IFN-α. Longer-acting IFNs are needed to minimize the side effects

FIGURE 10.2 Structure of the IFN-α2 and IFN-α3 genes and four hybrid genes.
Comparison of the sequences of the IFN-α2 and IFN-α3 genes shows shared restriction
enzyme sites (RE1, RE2, and RE3). Digestion of the genes at the indicated
restriction sites and ligation of the resulting fragments generate a number of different
hybrid IFN genes, of which four possibilities are shown.


from IFN treatment, lower the required dosage, and decrease the required 
frequency of the treatments. One approach to creating long-acting IFNs 
includes PEGylation. PEGylation entails covalently attaching polyethylene 
glycol (PEG) to proteins. The binding is typically achieved by incubation of 
a reactive derivative of PEG with the target protein molecule. PEGylation 
increases the size of IFN in solution, thereby prolonging its circulatory time 
by reducing its renal clearance. A simpler means of generating longer- 
acting IFNs is to fuse an IFN gene with the gene for a stable protein, such 
as human serum albumin, that, after translation, produces a stable hybrid 
protein. This combination has been called the albumin-interferon hybrid 
molecule (Zalbin, formerly Albuferon), and it retains all of the biological 
activity of the native IFN molecule (Fig. 10.4). Native IFN levels in the 
blood of a treated patient typically decrease rapidly, so that 2 days after 
administration, they are undetectable. On the other hand, with the albumin- 
interferon hybrid molecule, the drug (in this case, the fusion protein) in 
serum remains at a therapeutically effective level for a much longer time, 
so that it needs to be administered no more than once every 2 weeks. The 
initial clinical trials of the albumin-interferon hybrid molecule have all 
been positive. Phase III clinical trials of the albumin-interferon hybrid mol¬ 
ecule began in late 2006. If these trials are successful, then the albumin- 
interferon hybrid molecule may be available for general use some time in 
2010. 

Human Growth Hormone 

Human growth hormone (somatotropin) is a 191-amino-acid pituitary 
protein with a molecular mass of 22,125 daltons (Da) that stimulates the 
FIGURE 10.3 Construction of hybrid IFN-α genes. The resultant IFN-α gene-shuffled 
libraries are tested for antiproliferative and antiviral activities. 


production of insulin-like growth factor 1. Insulin-like growth factor 1 is 
an essential component of the promotion of growth in children, and in 
adults, it controls metabolism. Human growth hormone was one of the 
first therapeutic proteins in the world to be approved for human use. The 
recombinant form of the protein is produced in E. coli and is identical to 
native pituitary-derived human growth hormone. Infants and children 
who lack sufficient endogenous levels of human growth hormone, patients 
with chronic renal insufficiency (defective kidneys), and individuals with 
Turner syndrome respond to treatment with growth hormone, which 
stimulates tissue and bone growth, increases protein synthesis and min¬ 
eral retention, and decreases body fat storage. 

The first recombinant growth hormone was called somatrem (Protropin); 
it was produced and marketed by Genentech beginning in 1985. It had an 
amino acid sequence that was identical to that of human growth hormone, 
except that there was an extra methionine residue at the N-terminal end of 
the peptide chain (which was thought to prolong its half-life). It was dis¬ 
continued in the late 1990s. 

Treatment of children with human growth hormone typically entails 
daily injections during the years when the child is growing. The cost of the 
treatment varies depending on the country and the size of the child but is 
generally approximately $10,000 to $30,000 per year. In addition, in 2004, 
the U.S. Food and Drug Administration (FDA) approved the use of recom¬ 
binant human growth hormone for individuals whose short stature was 
BOX 10.1 


Clinical Trials 

After the discovery of a new drug 
or course of treatment, and before 
it is made available to the public, it is 
essential that extensive studies and 
analysis of its safety and efficacy be 
conducted and then reviewed by an 
impartial agency. Although a large 
number of countries have developed 
their own approaches to test new ther¬ 
apeutics, the "gold standard" for clin¬ 
ical trials is the set of requirements 
established by the FDA. This process 
is briefly described here. 

The preclinical phase of therapeutic 
drug development (i.e., the initial 
stage of the process of bringing a new 
therapeutic agent to market) entails 
thorough and extensive laboratory 
research on the mode of action, struc¬ 
ture, and other biochemical and phys¬ 
ical properties of a potential new 
drug. Scientists working at universi¬ 
ties, research institutes, and drug and 
biotechnology companies are continu¬ 
ally discovering and testing new mol¬ 
ecules, as well as new uses for known 
compounds. However, it is impossible 
to know with any certainty which ave¬ 
nues of research will eventually bear 
fruit. Once a promising result has been 
obtained in the laboratory, and it has 
been shown to be reproducible, suffi¬ 
cient quantities of a highly purified 
version of the potential therapeutic 
compound must be produced so that 
it can be tested on small animals, such 
as mice. If the animal tests are positive 
and there is no evidence of any 
serious side effects, the organization 
seeking to commercialize the research 


files an "investigational new drug" 
application with the FDA. This is an 
application to begin the process of 
clinical trials. Based on the preclinical 
research data that are provided, about 
85% of these applications are 
approved. 

Clinical trials are conducted in 
three distinct phases (described 
below), generally requiring a total of 
about 7 to 9 years at a cost of approxi¬ 
mately $75 million to $100 million to 
complete. At each stage, various com¬ 
pounds are dropped from consider¬ 
ation based on the results obtained. 
Eventually, approximately 20% of the 
compounds that looked promising 
based on preclinical results will, after 
a careful review of all the data, finally 
be approved. This slow and expensive 
process is claimed to be "the most 
effective method ever devised to 
assess the efficacy of a treatment." 

The three phases of the FDA review 
process are as follows. 

Phase I: With between 10 and 100 
healthy people, the safety of the 
drug and, starting with very low 
doses, the highest dosages that can 
be administered are assessed. 

When there is a chance that serious 
side effects may result, individuals 
affected with the disorder that the 
drug is designed to alleviate may 
be used. 

Phase II: With 50 to 500 affected 
patients, the optimal dosing reg¬ 
imen is determined. A control 
group is used so that it is possible 
to clearly distinguish between the 
effects of the drug and the natural 


remission of the disease. The use of 
a control group also helps to delin¬ 
eate real from apparent side effects 
of the treatment. 

Phase III: Depending upon the dis¬ 
ease, approximately 300 to 30,000 
patients who have the disease are 
tested. After it is established that 
the drug is not harmful and the 
optimal dosing regimen has been 
determined, the effectiveness of the 
treatment needs to be proven. 

The requirement for careful and 
thorough clinical trials ensures both 
the safety and efficacy of approved 
drugs. However, since the costs of 
both the preclinical research and the 
clinical trials are borne by pharmaceu¬ 
tical companies, this system makes it 
difficult for small companies that dis¬ 
cover a new product to eventually 
bring that product to market without 
the involvement of a large corporation 
with significant financial resources. 
Furthermore, the high cost of clinical 
trials and the low probability of a new 
drug's being approved mean that it is 
unlikely that therapeutic agents will 
even be considered for clinical trials 
unless there is a strong possibility that 
there will be significant financial gains 
from the sale of that agent. This finan¬ 
cial disincentive may discourage 
research on therapeutic agents for dis¬ 
eases that either affect only a relatively 
small number of people or affect only 
populations in poor, underdeveloped 
countries. 


caused by a variety of medical conditions other than human growth hor¬ 
mone deficiency. 

The strategy of designing a protein by either functional domain shuf¬ 
fling or directed mutagenesis can be used to augment or constrain its mode 
of action. For example, native human growth hormone binds to both 
growth hormone and prolactin receptors that occur on a number of dif¬ 
ferent cell types. To avoid unwanted side effects during therapy, it is desir¬ 
able that human growth hormone bind only to growth hormone receptors. 
Because the segment of the growth hormone molecule that binds to the 
growth hormone receptor overlaps but is not identical to the portion of the 
FIGURE 10.4 Schematic representation of the synthesis of the albumin-interferon 
fusion protein (Zalbin, formerly Albuferon), which includes human serum albumin 
(HSA) (red) at the N terminus and human IFN-α2b (blue) at the C terminus. 
Modified from http://www.hgsi.com/albinterferon-alfa-2b.html with permission. 


molecule that binds to the prolactin receptor, it should be possible to selec¬ 
tively decrease the binding to the prolactin receptor. 

Site-specific mutagenesis of the cloned human growth hormone cDNA 
was used to change some of the amino acid side chains that act as ligands 
for Zn²⁺ (i.e., His-18, His-21, and Glu-174), because the ion is required for 
the high-affinity binding of human growth hormone to the prolactin 
receptor (Fig. 10.5). As hoped, these modifications yielded human growth 
hormone derivatives that bound to the growth hormone receptor but not to 
the prolactin receptor. These derivatives are being tested for safety and 
efficacy in humans. 


FIGURE 10.5 Schematic representation of native and modified human growth hor¬ 
mone. Oligonucleotide-directed mutagenesis was used to alter human growth 
hormone so that it no longer bound to the prolactin receptor but retained its speci¬ 
ficity for the growth hormone receptor. 
As a consequence of its relatively short half-life in plasma, human 
growth hormone therapy currently requires subcutaneous injection once a 
day. This treatment is both inconvenient and expensive. Therefore, it would 
be advantageous to have a long-lasting form of human growth hormone. 
To this end, the extracellular domain of the human growth hormone 
receptor was fused to human growth hormone using a 20-amino-acid-long 
linker peptide consisting of four repeats of the amino acid sequence Gly₄Ser (Fig. 
10.6). This construct has a very strong tendency to dimerize as the growth 
hormone moiety from one molecule binds with the receptor portion of 
another molecule. When this growth hormone construct was tested in rats, 
a single injection promoted growth for 10 days (compared to the usual 
requirement in rats for daily injections). It is thought that the dimerization 
of the growth hormone construct stabilizes human growth hormone in 
vivo so that it is cleared from plasma approximately 300 times more slowly 
than free human growth hormone. Under these conditions, the active 
monomeric form (Fig. 10.6A) is slowly released from the inactive dimeric 
growth hormone (Fig. 10.6B), allowing it to bind to the growth hormone 
receptor (Fig. 10.6C). This experiment is certainly intriguing. It remains to 
be determined whether humans respond in a similar manner to the 
dimerized complex. 
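
The linker arithmetic in this paragraph can be checked with a short sketch. It assumes the usual one-letter amino acid codes (G for glycine, S for serine); the function and constant names are illustrative and do not come from the original study.

```python
# Illustrative sketch: assemble the linker described in the text,
# i.e., four repeats of Gly-Gly-Gly-Gly-Ser.
GLY4SER_UNIT = "GGGGS"  # one-letter codes: four glycines, then one serine
REPEATS = 4             # number of repeats stated in the text


def build_linker(unit: str = GLY4SER_UNIT, repeats: int = REPEATS) -> str:
    """Return the linker peptide as a one-letter amino acid string."""
    return unit * repeats


linker = build_linker()
print(linker, len(linker))
```

Four repeats of a five-residue unit give the 20-amino-acid length quoted for the linker.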

Another method that has been devised to prolong the active lifetime of 
human growth hormone includes fusing the coding sequences for the 
C-terminal end of human growth hormone (~22 kDa) with the N-terminal 
end of human serum albumin (~67 kDa). This fusion protein is called 
Albutropin (Fig. 10.7); it has a molecular mass of ~89 kDa and is produced 
by a strain of yeast that has been genetically modified so that the proteins 
that it produces have a minimal number of posttranslational modifications. 
The stabilization of the human growth hormone portion of Albutropin 
reflects the stability of human serum albumin, which has a half-life in 
serum of about 19 days. Albutropin has been shown to be effective in both 

FIGURE 10.6 Derivatization of growth hormone by coupling it to a portion of the 
growth hormone receptor using a 20-amino-acid peptide. (A) Monomeric deriva¬ 
tive; (B) dimeric derivative; (C) monomeric derivative bound to a growth hormone 
receptor. 
FIGURE 10.7 Schematic representation of 
the fusion protein Albutropin, which 
includes human serum albumin (red) 
at the N terminus and human growth 
hormone (blue) at the C terminus. 


rats and monkeys, in which high levels of the protein in serum were 
observed 5 days after it was administered (Fig. 10.8). Moreover, Albutropin 
has successfully completed phase I clinical trials. 
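
The molecular mass quoted for Albutropin follows from simple addition, as the sketch below shows. It is only a rough check under stated assumptions: the two component masses are the approximate values given in the text, and the only correction applied is the loss of one water molecule (about 18 Da) for the peptide bond that joins the two parts. Any signal peptide, linker, or posttranslational modification is ignored.

```python
# Rough mass check for a two-part fusion protein (values in kDa).
HGH_KDA = 22.0     # approximate mass of human growth hormone, from the text
HSA_KDA = 67.0     # approximate mass of human serum albumin, from the text
WATER_KDA = 0.018  # one water is lost for each peptide bond that joins two parts


def fusion_mass_kda(*parts_kda: float) -> float:
    """Sum the component masses and subtract one water per junction."""
    junctions = max(len(parts_kda) - 1, 0)
    return sum(parts_kda) - junctions * WATER_KDA


print(f"Estimated fusion mass: ~{fusion_mass_kda(HGH_KDA, HSA_KDA):.0f} kDa")
```

At this level of rounding the water correction is negligible, so the estimate agrees with the ~89 kDa stated above.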

Tumor Necrosis Factor Alpha 

While a number of studies have clearly shown that tumor necrosis factor 
alpha (TNF-α) is a potent antitumor agent, it has not been widely used in 
this capacity because of its severe toxicity. If TNF-α could be delivered 
directly to its site of action, i.e., the tumor, then lower doses could be used 
and the unwanted side effects would be diminished. To develop a version 
of TNF-α with tumor specificity, DNA encoding the peptide 
Cys-Asn-Gly-Arg-Cys-Gly (which targets a tumor cell surface protein) was 
fused to TNF-α DNA. The fusion protein contained a 6-amino-acid extension 
at its N-terminal end (Fig. 10.9). In mice, the cytotoxic activities of 
Cys-Asn-Gly-Arg-Cys-Gly-TNF-α and TNF-α were identical, indicating that the 
additional amino acids did not prevent protein folding, combining of three 
subunits to form a trimer, or binding to receptors. However, the modified 
version of TNF-α was 12 to 15 times more effective at inhibiting tumor 
growth than the unmodified form. Moreover, a higher percentage of mice 
with lymphoma survived after treatment with the modified factor (Fig. 
10.10). In addition, all the mice that were treated with the modified factor 
and survived for 30 days survived a second and third challenge with 
mouse lymphoma cells. These data indicate that there is a significant 
benefit, at least in mice, to fusing TNF-α with a short targeting peptide. 
Nevertheless, this work must be regarded as preliminary until its efficacy 
is demonstrated in humans. 


FIGURE 10.8 Intravenous concentration in monkeys of either human growth hor¬ 
mone or Albutropin following subcutaneous injection. 
FIGURE 10.9 Schematic representation of TNF-α (blue) without (A) and with (B) a 
6-amino-acid peptide (red) fused to its N terminus. The protein structure shown is 
hypothetical; only the numbers of amino acid residues (shown as circles) are 
accurately depicted. 


Enzymes 

DNase I 

Cystic fibrosis is one of the most common fatal hereditary diseases among 
Europeans and their descendants, with approximately 30,000 diagnosed 
cases in the United States and another 23,000 cases in Canada and Europe. 
It is estimated that a mutant cystic fibrosis gene is carried by 1 in 29 
Europeans, 1 in 65 African Americans, and 1 in 150 Asians. Individuals 
with cystic fibrosis are highly susceptible to bacterial infections in their 
lungs. Antibiotic treatment of patients who have these recurring infections 
eventually leads to the selection of antibiotic-resistant bacteria. The pres¬ 
ence of bacteria, some alive and some lysed, contributes to the accumula¬ 
tion of a thick mucus in the lungs of these patients, making breathing very 
difficult and acting as a source for further infection. The thick mucus in the 


FIGURE 10.10 Survival of lymphoma-bearing mice following treatment with 3 µg of 
either TNF-α or Cys-Asn-Gly-Arg-Cys-Gly-TNF-α (CNGRCG-TNF) as a function 
of the number of days after treatment. 
lungs is the result of the combination of the alginate that is secreted by the 
living bacteria, the DNA that is released from lysed bacterial cells, and 
degenerating leukocytes that accumulate in response to the infection, as 
well as filamentous actin derived from the cytoskeletons of damaged epi¬ 
thelial cells (Fig. 10.11). To address this problem, scientists at the U.S. bio¬ 
technology company Genentech isolated the gene for the human enzyme 
deoxyribonuclease I (DNase I) and subsequently expressed the gene in 
Chinese hamster ovary (CHO) cells in culture. DNase I can hydrolyze long 
polymeric DNA chains into much shorter oligonucleotides. The purified 
enzyme was delivered in an aerosol mist to the lungs of patients with cystic 
fibrosis. The DNase I decreased the viscosity and adhesivity of the mucus 
in the lungs and made it easier for these patients to breathe. While this 
treatment is not a cure for cystic fibrosis, it nevertheless relieves the most 
severe symptom of the disease in most patients. The enzyme was approved 
for use by the FDA in 1994; it had sales of approximately $100 million in 
2000. 

The monomeric form of actin binds very tightly to DNase I (inhibitor 
constant [Ki] = ~1 nM) and inhibits its ability to cleave DNA (Fig. 10.12). 
This interaction limits the effectiveness of DNase I as a therapeutic agent. 
On the basis of X-ray crystallographic studies, it was possible to predict 
which amino acid residues of DNase I interacted with actin and were there¬ 
fore possible targets for change by directed mutagenesis. For example, 
changing amino acid 144 from alanine to arginine or amino acid 65 from 
tyrosine to arginine decreased the binding of DNase I to actin up to 10,000- 
fold. In addition, the actin-resistant mutants had 10- to 50-fold more DNase 
I activity than the native enzyme. It is not known whether any additional 
benefit might be realized by combining the amino acid changes from sev¬ 
eral actin-resistant mutants. The clinical efficacy of a DNase I mutant 
enzyme that does not bind actin still remains to be demonstrated. 
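
To give a feel for what a 10,000-fold change in actin binding could mean, the following sketch applies a simple 1:1 binding model. Several things here are assumptions rather than data from the text: the free actin concentration is a hypothetical value chosen for illustration, free actin is taken to be in large excess over the enzyme, and the reported decrease in binding is treated as an equivalent increase in the inhibitor constant.

```python
# Simple 1:1 binding model: fraction of DNase I occupied by monomeric actin.
def fraction_bound(inhibitor_nm: float, ki_nm: float) -> float:
    """Fraction of enzyme bound, assuming free inhibitor is in large excess."""
    return inhibitor_nm / (inhibitor_nm + ki_nm)


KI_NATIVE_NM = 1.0    # ~1 nM, from the text
FOLD_WEAKER = 10_000  # upper limit of the reduced binding, from the text
KI_MUTANT_NM = KI_NATIVE_NM * FOLD_WEAKER  # assumption: weaker binding = larger Ki
ACTIN_NM = 100.0      # hypothetical free actin concentration, for illustration only

for name, ki in (("native", KI_NATIVE_NM), ("mutant", KI_MUTANT_NM)):
    print(f"{name}: {fraction_bound(ACTIN_NM, ki):.1%} of DNase I bound to actin")
```

Under these assumptions nearly all of the native enzyme would be tied up by actin, whereas only about 1% of an actin-resistant mutant would be. The real situation in mucus is certainly more complex.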

Alginate Lyase 

Alginate is a polysaccharide polymer that is produced by a wide range of 
seaweeds and both soil and marine bacteria. Alginate is composed of 
chains of the sugars β-D-mannuronate and α-L-guluronate. The properties 
of a particular alginate depend on the relative amounts and distribution of 
these two saccharides. For example, stretches of α-L-guluronate residues 
form both interchain and intrachain cross-links by binding calcium ions, 
and the β-D-mannuronate residues bind other metal ions. The cross-linked 
alginate polymer forms an elastic gel. In general, the structure of an 
alginate polymer is related to its viscosity, which is in turn directly 
proportional to its molecular size. 

FIGURE 10.11 Schematic representation of a portion of a human lung occluded by 
a combination of live alginate-secreting bacterial cells, lysed bacterial cells, and 
leukocytes and their released DNA. This matrix may be digested by alginate lyase 
or DNase I. 

The excretion of alginate by mucoid strains of Pseudomonas aeruginosa 
that infect the lungs of patients with cystic fibrosis significantly contributes 
to the viscosity of the mucus in the airways. Once mucoid strains of P. 
aeruginosa have become established in the lungs of cystic fibrosis patients, 
it is almost impossible to eliminate them by antibiotic treatment. This is 
because the bacteria form biofilms in which the alginate prevents the anti¬ 
biotics from coming into contact with the bacterial cells. In one experiment, 
it was shown that the addition of alginate lyase, which can liquefy bacterial 
alginate, together with or prior to antibiotic treatment, significantly 
decreased the number of bacteria found in biofilms (Fig. 10.13). This result 
suggests that, in addition to the DNase I treatment, depolymerization of 
the alginate would help clear blocked airways of individuals with cystic 
fibrosis. 

An alginate lyase gene has been isolated from a Flavobacterium species, 
a gram-negative soil bacterium that is a strong producer of this enzyme. A 
Flavobacterium clone bank was constructed in E. coli and screened for 
alginate lyase-producing clones by plating the entire clone bank onto solid 
medium containing alginate. Following growth, colonies that produced 
alginate lyase formed a halo around the colony when calcium was added 
to the plate (Fig. 10.14). In the presence of calcium, all of the alginate in the 
medium, except in the immediate vicinity of an alginate lyase-positive 
clone, becomes cross-linked and opaque. Since hydrolyzed alginate chains 
do not form cross-links, the medium surrounding an alginate lyase-positive 
clone is transparent. Analysis of a cloned DNA fragment from one of the 
positive colonies revealed an open reading frame encoding a polypeptide 
with a molecular mass of approximately 69,000 Da. Detailed biochemical 



FIGURE 10.12 Schematic representation of 
the ternary complex of human DNase I, 
actin, and DNA. 


FIGURE 10.13 Time courses of the killing of bacteria in a biofilm with and without the 
addition of alginate lyase. 


FIGURE 10.14 Schematic representation 
of the detection of an alginate lyase- 
producing clone from a clone bank of a 
Flavobacterium sp. in E. coli. The alg¬ 
inate that is present in the growth 
medium is digested by alginase secreted 
by an E. coli clone. The alginate in the 
vicinity of such a colony is not cross- 
linked when calcium is added and 
instead produces a clear zone (halo) 
surrounding the colony. 


and genetic studies indicated that this polypeptide is a precursor of the 
three different alginate lyases produced by the Flavobacterium sp. (Fig. 
10.15). After the 69,000-Da precursor is produced, a proteolytic enzyme 
cleaves off an N-terminal peptide of about 6,000 Da. The 63,000-Da protein 
can lyse both bacterial and seaweed alginates. Cleavage of the 63,000-Da 
protein yields a 23,000-Da enzyme that depolymerizes seaweed alginate 
and a 40,000-Da enzyme that is effective against bacterial alginate. To pro¬ 
duce large amounts of the 40,000-Da enzyme, the DNA corresponding to 
the enzyme was amplified by the polymerase chain reaction (PCR) and 
then inserted into a Bacillus subtilis plasmid vector fused to a B. subtilis 
α-amylase leader peptide to direct the secretion of the protein and placed 
under the transcriptional control of a penicillinase gene promoter (Fig. 
10.16). Transformation of B. subtilis cells with this construct yielded 
colonies with large halos on solid medium containing alginate after calcium 
was added. When these transformants were grown in liquid medium, the 
recombinant alginate lyase was secreted into the culture broth. Further 
tests showed that the enzyme efficiently liquefied alginates that were pro¬ 
duced by mucoid strains of P. aeruginosa that had been isolated from the 
lungs of patients with cystic fibrosis. Additional studies are necessary to 
determine whether recombinant alginate lyase is an effective therapeutic 
agent. 
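
The masses quoted for the processing of the alginate lyase precursor are internally consistent, which the sketch below confirms. The names given to the intermediate products are illustrative labels, not the authors' terminology. The masses are the rounded values from the text, so the roughly 18 Da of water added at each hydrolytic cleavage is ignored.

```python
# Masses in daltons, as quoted in the text (all values are approximate).
PROCESSING_STEPS = [
    # (parent name, parent mass, {product name: product mass})
    ("precursor", 69_000,
     {"N-terminal peptide": 6_000, "broad-specificity lyase": 63_000}),
    ("broad-specificity lyase", 63_000,
     {"seaweed alginate lyase": 23_000, "bacterial alginate lyase": 40_000}),
]


def mass_balance(steps):
    """For each cleavage step, return parent mass minus the sum of product masses."""
    return {parent: mass - sum(products.values())
            for parent, mass, products in steps}


print(mass_balance(PROCESSING_STEPS))
```

A difference of zero at each step means that the 6,000-Da peptide plus the 63,000-Da protein account for the precursor, and that the 23,000-Da and 40,000-Da enzymes account for the 63,000-Da protein.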

Phenylalanine Ammonia Lyase 

The human genetic disease phenylketonuria results from the impaired 
functioning of the enzyme phenylalanine hydroxylase. In the United States, 
approximately 1 of every 12,000 newborns has phenylketonuria. When 
phenylalanine hydroxylase, which oxidizes phenylalanine to tyrosine, is 
deficient, the normal cognitive development of an individual is impaired 
and mental retardation ensues due to a buildup of phenylalanine. Following 
diagnosis of phenylketonuria, either prenatally or shortly after birth, the 


FIGURE 10.15 Processing of the recombinant Flavobacterium alginate lyase protein 
precursor in E. coli. A 6-kDa peptide is removed from the N terminus of the 69-kDa 
precursor to yield a 63-kDa protein that can depolymerize alginate from both sea¬ 
weed and bacteria. A second cleavage event converts the 63-kDa protein into a 
23-kDa protein that is active against seaweed alginate and a 40-kDa protein that 
hydrolyzes bacterial alginate. 
[Figure labels, in order: penicillinase promoter; α-amylase leader peptide; 40,000-Da alginate lyase] 

FIGURE 10.16 DNA construct encoding the 40,000-Da alginate lyase. The leader 
peptide from a B. subtilis α-amylase gene is fused to the N terminus of the alginate lyase 
coding sequence. The construct is under the transcriptional control of a B. subtilis 
penicillinase gene expression system. 


treatment entails a controlled semisynthetic diet with low levels of pheny¬ 
lalanine through infancy and possibly for life. A possible alternative treat¬ 
ment would be the administration of the enzyme phenylalanine hydroxylase. 
Unfortunately, phenylalanine hydroxylase is a multienzyme complex that 
is not very stable and requires a cofactor for activity. On the other hand, 
phenylalanine ammonia lyase, which converts phenylalanine to ammonia 
and trans-cinnamic acid (Fig. 10.17), is a stable enzyme that does not require 
a cofactor and could potentially prevent the accumulation of phenylalanine 
in phenylketonuria patients. To test this concept, the gene for phenylala¬ 
nine ammonia lyase from the yeast Rhodosporidium toruloides was cloned 
and overexpressed in E. coli. Preclinical studies were conducted with mice 
that were defective in producing phenylalanine hydroxylase and 
therefore accumulated phenylalanine. With these mice, plasma phenylalanine 
levels were lowered when phenylalanine ammonia lyase was injected 
intravenously or encapsulated enzyme was administered orally. Thus, at 
least in mice, phenylalanine ammonia lyase is an effective substitute for 
phenylalanine hydroxylase, and the orally delivered enzyme is sufficiently 
stable to survive the mouse gastrointestinal tract and still function. 
Although this report is preliminary, a combination of oral enzyme therapy 
with phenylalanine ammonia lyase and a less stringent low-phenylalanine 
diet might serve to improve the quality of life of individuals affected with 
this disease. 

α1-Antitrypsin 

The processing of a number of different pathogenic bacterial or viral pre¬ 
cursor proteins by human proteases occurs when the protease recognizes 
the amino acid sequence -Arg-X-Lys/Arg-Arg↓-, with peptide bond 
cleavage on the C-terminal side of the C-terminal Arg (as indicated by the 
arrow), where X is any of the 20 common amino acids. Since this pro¬ 
cessing step is common to several infectious agents, a therapeutic agent 


FIGURE 10.17 Products of the conversion of phenylalanine by phenylalanine hydrox¬ 
ylase and phenylalanine ammonia lyase. 
FIGURE 10.18 Schematic representation 
of α1-antitrypsin inhibiting the 
proteolytic cleavage of pathogenic precursor 
proteins by human proteases. 


that targeted the processing enzyme and blocked its activity might act as 
a broad-spectrum antibacterial and antiviral agent (Fig. 10.18). When a 
variant of human α1-antitrypsin was genetically engineered and tested in 
tissue culture experiments, the protein blocked the processing of human 
immunodeficiency virus (HIV) type 1 glycoprotein gp160, as well as 
measles virus protein F0, and consequently, in both cases, the production 
of infectious viruses. When the α1-antitrypsin variant was added to cell 
cultures, it blocked the production of human cytomegalovirus, a major 
cause of illness and death in organ transplant recipients and AIDS patients. 
The α1-antitrypsin variant is both potent and selective. Against human 
cytomegalovirus, it is at least 10-fold more effective than any currently 
used viral inhibitory agent. Its efficacy has been demonstrated in cell cul¬ 
ture, but it remains to be determined if the strategy is effective with whole 
animals. 
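
The recognition sequence described at the start of this section can be written as a pattern and searched for in a protein sequence. The sketch below is an illustration only: it uses one-letter amino acid codes, the test sequence is a made-up toy example rather than a real viral precursor, and the function names are hypothetical. Real protease specificity depends on more than this four-residue motif.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 common amino acids (X in the motif)

# Arg-X-(Lys or Arg)-Arg, written as a lookahead so overlapping sites are found.
MOTIF = re.compile(rf"(?=R[{AMINO_ACIDS}][KR]R)")


def cleavage_sites(seq: str) -> list[int]:
    """Return cut positions: the number of residues N-terminal to each cleaved bond."""
    return [m.start() + 4 for m in MOTIF.finditer(seq)]


def cleave(seq: str) -> list[str]:
    """Split the sequence on the C-terminal side of each motif's last Arg."""
    cuts = [0] + cleavage_sites(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:]) if a < b]


toy_precursor = "MKTAREKRAVGG"  # hypothetical sequence containing one Arg-Glu-Lys-Arg site
print(cleavage_sites(toy_precursor), cleave(toy_precursor))
```

The cut is placed after the fourth residue of each match, which corresponds to cleavage on the C-terminal side of the C-terminal arginine.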

Glycosidases 

The ABO blood group system is based upon the presence or absence of 
specific carbohydrate residues on the surfaces of erythrocytes, endothelial 
cells, and some epithelial cells. The monosaccharide that determines blood 
group A is a terminal α-1,3-linked N-acetylgalactosamine, while the 
corresponding monosaccharide of blood group B is α-1,3-linked galactose (Fig. 
10.19). Group O cells lack both of these monosaccharides at the ends of 


FIGURE 10.19 Digestion of the monosaccharides that determine blood groups A and 
B to obtain the H antigen (i.e., blood group O) with specific glycosidases. AcNH 
stands for an acetyl moiety covalently bound to a nitrogen atom. 

their oligosaccharide chains and instead contain α-1,2-linked fucose, which 
is designated the H antigen. Plasma from blood group A individuals con¬ 
tains antibodies against the B antigen, blood group B individuals have 
antibodies against the A antigen, and blood group O individuals have anti¬ 
bodies against both the A and B antigens. In practice, this means that indi¬ 
viduals with either anti-A or anti-B antibodies cannot safely receive a blood 
transfusion containing the incompatible antigen, since this is likely to cause 
a severe immune response (Table 10.2). As a consequence, blood group AB 
individuals are said to be universal recipients while those from blood 
group O are universal donors. Thus, when a blood transfusion is required, 
it is advantageous to have a large supply of plasma that is from blood 
group O (e.g., in an emergency situation, there may not be sufficient time 
to check a patient's blood group). Fortunately, digestion of blood cells from 
either type A or B with specific glycosidases can cause types A, B, and AB 
to be converted into type O (Fig. 10.19). These enzymes were found fol¬ 
lowing an extensive screening process of 2,500 fungal and bacterial isolates. 
Eventually, an active α-N-acetylgalactosaminidase, which converts group A 
to group O, was found in the gram-negative bacterium Elizabethkingia 
meningosepticum, and an active α-galactosidase, which converts group B 
to group O, was found in Bacteroides fragilis (also a gram-negative 
bacterium). The genes were isolated, and the proteins were characterized. Both 
of the enzymes have high specificity for cleaving the appropriate monosac¬ 
charide under conditions that maintain the integrity and functioning of the 
treated red blood cells. Moreover, each enzyme could readily be removed 
from the treated red blood cells following treatment. While this is a very 
recent and still preliminary experiment, if this novel approach works effec¬ 
tively in a clinical setting, then it should become a boon for all types of 
blood transfusions. 
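
The logic of blood group compatibility and of enzymatic conversion can be captured in a few lines. The sketch below is a simplified model that considers only the A and B antigens on donor cells; the enzyme names are used merely as dictionary keys and the function names are illustrative. A transfusion is treated as compatible when the donor cells carry no antigen that the recipient lacks.

```python
# Antigens carried on the red blood cells of each ABO group.
ANTIGENS = {
    "A": frozenset({"A"}),
    "B": frozenset({"B"}),
    "AB": frozenset({"A", "B"}),
    "O": frozenset(),
}

# Antigen removed by each glycosidase described in the text.
ENZYME_REMOVES = {
    "alpha-N-acetylgalactosaminidase": "A",
    "alpha-galactosidase": "B",
}


def compatible(donor_antigens: frozenset, recipient_type: str) -> bool:
    """Donor cells are acceptable if they carry no antigen foreign to the recipient."""
    return donor_antigens <= ANTIGENS[recipient_type]


def treat(blood_type: str, enzymes) -> frozenset:
    """Return the antigens left on cells after digestion with the given enzymes."""
    removed = {ENZYME_REMOVES[name] for name in enzymes}
    return ANTIGENS[blood_type] - removed


converted = treat("AB", ENZYME_REMOVES)  # digest type AB cells with both enzymes
print(all(compatible(converted, recipient) for recipient in ANTIGENS))
```

In this model, cells of any group that have been digested with both enzymes carry neither antigen and therefore behave like group O cells toward every recipient.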


Lactic Acid Bacteria 

Lactic acid bacteria are widely used in the production and preservation of 
fermented foods, and many have been given the designation "generally 
regarded as safe" within the food industry. Many of these organisms are 
members of the indigenous microflora of the human gut and have been 
recognized for their health-promoting properties. Some strains of lactic acid 
bacteria, notably lactobacilli, are used in probiotic products. A probiotic is a 
live microorganism that is claimed to confer a health benefit by altering the 
indigenous microflora of the intestinal tract. Lactic acid bacteria have also 
been used to treat several gastrointestinal disorders, including lactose intol¬ 
erance, traveler's diarrhea, antibiotic-associated diarrhea, infections caused 


TABLE 10.2 Compatible and incompatible blood groups 

Donor          Recipient blood type 
blood type     A              B              AB             O 
A              Compatible     Incompatible   Compatible     Incompatible 
B              Incompatible   Compatible     Compatible     Incompatible 
AB             Incompatible   Incompatible   Compatible     Incompatible 
O              Compatible     Compatible     Compatible     Compatible 

Individuals from one blood group may safely receive a blood transfusion from individuals from a 
compatible blood group but not from someone from an incompatible blood group. 
by various bacterial and viral pathogens, and immunopathological disor¬ 
ders, such as Crohn disease and ulcerative colitis. 

In the past few years, lactic acid bacteria have been used as a host 
system to express various foreign genes with the idea that these bacteria 
facilitate the delivery of the proteins encoded by the genes to the human 
gut. In particular, Lactococcus lactis has been developed as a host for this 
purpose. L. lactis is a nonpathogenic, noninvasive, noncolonizing gram¬ 
positive bacterium that is often used in the production of fermented foods. 
Moreover, L. lactis has been used for many years as a human probiotic. 

Interleukin-10 

Ulcerative colitis and Crohn disease, both diseases of the intestinal tract, 
affect approximately 1 in every 500 to 1,000 people in the developed coun¬ 
tries of the world. Ulcerative colitis is associated with excess type 2 T helper 
cell cytokines, including interleukin-4 and interleukin-5, whereas in Crohn 
disease, type 1 T helper cell cytokines, including TNF-α, IFN-γ, and 
interleukin-2, are overproduced. The treatment for Crohn disease often includes 
trying to lower the levels of cytokines, especially TNF-α. One approach has 
been the administration of antibodies against TNF-α. Other workers have 
targeted interleukin-10 as a means of controlling Crohn disease because it 
modulates the regulatory T cells that control inflammatory responses to 
intestinal antigens. However, interleukin-10 is not clinically acceptable 
because it needs to be administered by either frequent injections or rectal 
enemas. To overcome this problem, the bacterium L. lactis was engineered 
to synthesize and secrete interleukin-10. 

Experiments were performed with mice to test whether interleukin-10-secreting
L. lactis could be used to treat inflammatory bowel disease (Fig. 10.20).
First, interleukin-10-secreting L. lactis was fed to mice with ulcerative
colitis that had been induced by 5% dextran sulfate in their drinking
water. Second, strains of mice that are genetically incapable of synthesizing
interleukin-10 and provide an animal model for ulcerative colitis were
tested. In both of these cases, the engineered L. lactis significantly alleviated
the symptoms of the disease, establishing that this approach works in
principle. However, these mouse models for inflammatory bowel disease are
not identical to the disease in humans, and a large number of questions
remain before the treatment is used with humans.


FIGURE 10.20 Schematic representation of the effects of intestinal interleukin-10
(IL-10)-secreting bacteria on inflammatory bowel disease in mice.

One concern about the use of an interleukin-10-secreting L. lactis strain
as a therapeutic approach is the possibility that the genetically modified
bacterium will be released to the environment. If this were to happen, the
plasmid carrying the interleukin-10 gene and any plasmid-borne antibiotic
resistance marker genes could be spread to other bacteria in the environment.
To prevent this from occurring, the L. lactis thymidylate synthase gene, thyA,
which is essential for the growth of the bacterium, was replaced with a
synthetic human interleukin-10 gene that was inserted into the bacterial
chromosome by homologous recombination (Fig. 10.21). This
strain produced interleukin-10 and grew well in the laboratory when either
thymidine or thymine was added to the medium. However, when it was
deprived of thymidine and thymine, the viability of the bacterium declined
by several orders of magnitude. When this modified bacterium was tested
in pigs, whose digestive tract is similar to that of humans, it thrived and
actively produced interleukin-10. In addition, laboratory experiments
demonstrated that the modified L. lactis was extremely unlikely to acquire a
thymidylate synthase gene from other bacteria in the environment,
confirming both the safety and efficacy of this approach.

Recently, clinical trials with this L. lactis strain were initiated. To date, 
10 patients with Crohn disease have been treated. So far, a significant 
decrease in disease activity has been observed, with only minor adverse 
events. Moreover, bacteria isolated from the patients' feces were not able to 
grow without the addition of thymidine. In other words, the engineered L. 
lactis did not acquire a thymidylate synthase gene, indicating that the
containment strategy was effective. Thus, initial indications are that this
strategy appears to be working as well in humans as it did with small
animals.

Leptin 

It has been estimated that approximately 30% of the North American and
20% of the European populations are overweight. Moreover, North
Americans annually spend tens of billions of dollars on various weight
reduction schemes, most of which are unsuccessful. However, real weight
reduction may be obtained by administration of the protein leptin. Leptin,
the product of the obese (ob) gene, is a 167-amino-acid protein with a
molecular mass of approximately 16 kDa. Leptin is synthesized as a precursor
with a 21-amino-acid-long signal peptide that is removed when leptin is
secreted. Treatment with recombinant leptin can reduce food intake and
correct metabolic perturbations in (homozygous) leptin-deficient mice.
Leptin also helps to overcome human congenital leptin deficiency. However,
when it is introduced subcutaneously, leptin is not particularly effective in
obese patients unless their serum leptin concentrations reach levels 20- to
30-fold higher than normal. This response has been attributed to the
inefficient transport of leptin across the blood-brain barrier. To overcome this
problem, a scheme for the intranasal delivery of leptin has been devised.


FIGURE 10.21 The genetic construct integrated into the chromosomal DNA of L. lactis
in place of its thymidylate synthase gene. The promoter (p thyA) is from the
thymidylate synthase gene. The interleukin-10 gene was chemically synthesized so
that its codon usage was optimized for L. lactis, thereby ensuring a high level of
protein expression.

When leptin is produced in E. coli, it typically forms insoluble inclusion 
bodies that must be solubilized and renatured before the active protein is 
generated. This is a time-consuming, inefficient, and expensive process. In 
one study, the 462-bp cDNA for human leptin without its signal peptide 
was cloned and expressed under the control of the nisin promoter in L. 
lactis (Fig. 10.22). Nisin is a 34-amino-acid-residue polycyclic peptide that 
has antibacterial activity and is used as a food preservative. In L. lactis, 
leptin was produced efficiently without the formation of an inclusion body 
and was secreted from the recombinant bacteria. Intranasal administration 
of the leptin-producing L. lactis strain significantly reduced food intake and 
body weight in obese mice. This approach opens up the possibility that, if 
delivered properly, leptin might act as an effective weight loss treatment in 
humans. 

An HIV Inhibitor 

Worldwide, the predominant mode of HIV transmission is by heterosexual 
contact. One possible way to protect women, who currently account for
about half of all new cases of HIV/AIDS, against HIV infection is a topical
microbicide, delivered by a live vaginal Lactobacillus strain, that prevents 
HIV infection directly at mucosal surfaces. This strategy seems reasonable 
because naturally occurring vaginal Lactobacillus strains play a protective 
role in preventing urogenital infections. 

The compound cyanovirin N, isolated from the cyanobacterium Nostoc
ellipsosporum, blocks several steps of HIV infection, preventing virus entry
into human cells. Consequently, cyanovirin N is a candidate for a topical
microbicide to prevent HIV infections. To ensure that cyanovirin N would
be expressed at a sufficiently high level in a vaginal strain of Lactobacillus
jensenii, the gene was chemically synthesized to reflect the codon usage
found in the bacterium. Typically, the GC content of lactobacilli is about
36%. In addition, during the synthesis of the gene, proline 51 was replaced
by a glycine residue to stabilize the cyanovirin N, and four amino acids
were added to the N terminus to ensure proper cleavage of the signal
sequence (Fig. 10.23). The modified cyanovirin N gene was fused to a
strong and constitutive Lactobacillus promoter. The final construct was
integrated into the chromosomal DNA of a strain of L. jensenii and, when it was
tested for efficacy, was found to be highly effective at preventing HIV
infections in mice. Under these conditions, about 4 pg of cyanovirin N per mL
was released into the culture medium.


FIGURE 10.22 Genetic construct used to secrete leptin from L. lactis.
[Figure labels, in order: nisin promoter, signal peptide, leptin cDNA]


FIGURE 10.23 Flowchart of the scheme used to develop a Lactobacillus strain that
produces and secretes cyanovirin N (CV-N).
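
The gene synthesis strategy described for cyanovirin N (choosing codons that
match the host and introducing a single amino acid substitution) can be
sketched in a few lines of Python. This is only an illustration under stated
assumptions: the PREFERRED_CODON table holds AT-rich codons picked by hand for
the example, not measured L. jensenii codon usage; the peptide "MKPW" is a toy
sequence, not cyanovirin N; and the function names are hypothetical.

```python
# One codon per amino acid, chosen here to be AT-rich for illustration only.
# ASSUMPTION: this is NOT a measured Lactobacillus codon usage table.
PREFERRED_CODON = {
    "A": "GCA", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "TTA",
    "M": "ATG", "N": "AAT", "P": "CCA", "Q": "CAA", "R": "AGA",
    "S": "TCA", "T": "ACA", "V": "GTT", "W": "TGG", "Y": "TAT",
    "*": "TAA",
}


def substitute_residue(protein: str, position: int, old: str, new: str) -> str:
    """Replace the residue at a 1-based position, checking the expected one."""
    if protein[position - 1] != old:
        raise ValueError(f"expected {old} at position {position}")
    return protein[: position - 1] + new + protein[position:]


def back_translate(protein: str) -> str:
    """Write a coding sequence using the preferred codon for each residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def gc_content(dna: str) -> float:
    """Fraction of G and C nucleotides in a DNA sequence."""
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    toy = "MKPW"                                  # toy peptide, hypothetical
    stabilized = substitute_residue(toy, 3, "P", "G")  # proline -> glycine
    gene = back_translate(stabilized + "*")       # append a stop codon
    print(stabilized, gene, round(gc_content(gene), 2))
```

A real design would draw the codon table from the host's highly expressed
genes and would also check features such as repeats and secondary structure,
which this sketch leaves out.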


Monoclonal Antibodies 

About 100 years ago, horses were inoculated with the bacterium 
Corynebacterium diphtheriae, which causes diphtheria in humans. The 
resulting crude horse antiserum was used to treat this often fatal childhood 
disease. In those days, mortality sometimes reached 45%. C. diphtheriae 
infects the throat or tonsils and produces an exotoxin that is lethal to 
human cells. This exotoxin enters the bloodstream and damages organs 
that are distant from the primary site of infection. The administration of 
horse antiserum containing antibodies against the exotoxin provided passive
immunity, protecting the patient from a fatal outcome when the antiserum
was given within the first few days after the onset of infection.

Unfortunately, this kind of antibody therapy carries considerable risk 
and is not widely used today. Patients often develop antibodies against the 
foreign proteins of either whole or partially purified horse antiserum. After 
a second treatment, the sensitized patient may go into anaphylactic shock 
and die. As a result, the use of antibodies as therapeutic agents was
considered too dangerous for patients, and antibodies were used only rarely.

However, with the development of hybridoma technology, antibodies
are once again seen as potential therapeutic agents. One reason for the
renewed interest in therapeutic antibodies is that it is now possible to
engineer antibodies with a greatly reduced level of immunogenicity in humans.
In addition, this technique can be used to maintain a continuous supply of
pure monospecific antibody. However, the problems of cross-reactivity
leading to an immune response and anaphylaxis have not been completely
overcome. Thus, the recipient might still produce antibodies to a monoclonal
antibody that carries mouse (murine) determinants. To avoid this
problem, human monoclonal antibodies with both specific immunotherapeutic
properties and lowered potential for immunogenicity have been
produced. In fact, a number of monoclonal antibodies have been approved
for treating human diseases (Table 10.3, Box 10.2, and Box 10.3).


TABLE 10.3 Some therapeutic monoclonal antibodies that have been approved for human use in either the United
States or the European Union

Approval
date      Antibody      Drug name    Antibody type         Therapeutic use
1986      Muromonab     Orthoclone   Murine                Prevention of acute kidney transplant rejection
1994      Abciximab     ReoPro       Chimeric              Prevention of blood clots
1997      Daclizumab    Zenapax      Humanized             Prevention of acute kidney transplant rejection
1998      Rituximab     Rituxan      Chimeric              Treatment of non-Hodgkin lymphoma
1998      Infliximab    Remicade     Chimeric              Treatment of Crohn disease, psoriasis, rheumatoid arthritis
1998      Basiliximab   Simulect     Chimeric              Prevention of transplantation rejection
1998      Palivizumab   Synagis      Humanized             Treatment of viral infections in children
1998      Trastuzumab   Herceptin    Humanized             Treatment of metastatic breast cancer
2000      Gemtuzumab    Mylotarg     Humanized             Treatment of acute myeloid leukemia
2001      Alemtuzumab   Leukosite    Humanized             Treatment of chronic lymphocytic leukemia
2002      Adalimumab    Humira       Human                 Treatment of rheumatoid arthritis
2002      Ibritumomab   Zevalin      Chimeric              Treatment of non-Hodgkin lymphoma
2003      Efalizumab    Raptiva      Humanized             Treatment of severe plaque psoriasis
2003      Omalizumab    Xolair       Humanized             Treatment of severe persistent asthma
2003      Tositumomab   Bexxar       Murine + iodine-131   Treatment of non-Hodgkin lymphoma
2004      Cetuximab     Erbitux      Chimeric              Treatment of various cancers
2004      Natalizumab   Tysabri      Humanized             Treatment of multiple sclerosis
2004      Bevacizumab   Avastin      Humanized             Treatment of various cancers
2006      Panitumumab   Vectibix     Human                 Treatment of colorectal cancer
2009?     Denosumab                  Human                 Treatment of osteoporosis

In addition to the monoclonal antibodies listed here, a number of monoclonal antibodies have been approved for diagnostic and imaging
purposes. Phase III clinical trials of denosumab were successfully completed in 2008, and in early 2009 the manufacturer applied for FDA approval of
denosumab.

BOX 10.2

Trastuzumab: the First Humanized Monoclonal Antibody Approved for the
Treatment of Breast Cancer

In 25 to 30% of women with aggressive metastatic breast cancer, there is
a genetic alteration in the HER2 gene that results in the production of an
increased amount of human epidermal growth factor receptor 2 (HER2) protein
on the surface of the tumor. Overexpression of the HER2 protein can readily
be determined by using an immunohistochemistry-based assay. Some years ago,
researchers at Genentech isolated a mouse monoclonal antibody with high
affinity for the HER2 protein and then (using procedures similar to those
described in this chapter) humanized it. The humanized anti-HER2 monoclonal
antibody, trastuzumab (Herceptin), contains human FRs and mouse CDRs and is
produced commercially using mammalian (CHO) cells grown in suspension culture
as the host for the expression of the antibody. Antibodies produced in CHO
cells are glycosylated similarly to bona fide human antibodies. After
humanization, trastuzumab bound to the HER2 protein with a dissociation
constant of approximately 5 × 10^-9 M, indicating that the high level of
specificity for the substrate had been maintained through the process of
humanization.

In the laboratory, and then in initial clinical trials with more than 800
patients, trastuzumab mediated antibody-dependent cellular cytotoxicity
(i.e., it told the immune system to target the cancerous cells) and inhibited
the proliferation of human tumor cells that overexpressed HER2 (i.e., it
stopped the cancerous cells from growing). Trastuzumab was most effective
when it was administered together with some of the chemicals that are
currently used for the treatment (chemotherapy) of breast cancer, provided
that the breast cancer was at a later stage of development. In two large
clinical trials that included over 3,700 women, those who received
trastuzumab and chemotherapy had a 52% higher chance that the cancer would
not return than those who were treated with chemotherapy alone. Trastuzumab
is provided by the manufacturer as a sterile white to pale yellow powder
containing 440 mg per vial, and after reconstitution, it is typically
administered intravenously over a period of 30 minutes and is taken weekly
for 52 weeks. Since a small number of individuals treated with trastuzumab
develop heart problems, it is necessary to carefully monitor the cardiac
functions of all patients on this therapy, especially older patients and
those with a family history of heart problems. In the relatively short time
that it has been available, trastuzumab has become a blockbuster drug, with
annual sales above $1 billion. In 2006, in the United States, trastuzumab
treatment for one individual cost approximately $40,000 for the year.


Structure and Function of Antibodies

An antibody molecule (immunoglobulin) consists of two identical light (L)
protein chains and two identical heavy (H) protein chains held together by
both hydrogen bonding and precisely localized disulfide linkages. The
N-terminal regions of the L and H chains together form the antigen
recognition site of each antibody. Antibody genes can be readily manipulated
because the various functions of an antibody molecule are confined to
discrete domains (regions) (Fig. 10.24). The sites that recognize and bind
antigens consist of three complementarity-determining regions (CDRs) that lie
within the variable (VH and VL) regions at the N-terminal ends of the two
H and two L chains. The CDRs are the part of an antibody molecule with
the greatest variability in amino acid sequence. In addition to the variable
regions, each L chain contains one constant region, or domain (CL), and
each H chain has three constant regions, or domains (CH1, CH2, and CH3).
When antibodies are digested with the proteolytic enzyme papain, three
fragments are released: two identical (Fab) fragments, each of which
contains an intact L chain linked by a disulfide bond to the VH and CH1
regions of the H chain, and one Fc fragment, which consists of two H chain
fragments, each containing the CH2 and CH3 domains and joined by a disulfide
bond (Fig. 10.24). The Fab fragment retains the antigen-binding activity. In
fact, the N-terminal half of the Fab fragment, which is called the Fv
fragment, contains all of the antigen-binding activity of the intact antibody
molecule (Fig. 10.24). The amino acid sequence of this portion of the
antibody varies considerably from one molecule to another. Each of the
constant and variable regions consists of approximately 110 amino acid
residues. A complete antibody molecule has a molecular mass of approximately
150 kDa, a Fab fragment is around 50 kDa, and an Fv fragment is about 25 kDa.
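
The masses quoted for the intact antibody, the Fab fragment, and the Fv
fragment can be checked roughly by counting domains. The Python sketch below
assumes about 110 residues per domain (from the text) and an average residue
mass of about 110 Da, which is a common rule of thumb and not a figure from
this chapter; glycosylation and the hinge region are ignored, and the names
used are illustrative.

```python
RESIDUES_PER_DOMAIN = 110      # from the text: ~110 residues per region
DA_PER_RESIDUE = 110           # ASSUMPTION: rule-of-thumb average residue mass

# Number of ~110-residue domains in each species.
# L chain = VL + CL (2 domains); H chain = VH + CH1 + CH2 + CH3 (4 domains).
DOMAINS = {
    "intact antibody": 2 * 2 + 2 * 4,   # two L chains + two H chains
    "Fab fragment": 2 + 2,              # one L chain + VH and CH1
    "Fv fragment": 1 + 1,               # VL + VH
}


def approx_mass_kda(n_domains: int) -> float:
    """Estimate mass in kDa from a domain count."""
    return n_domains * RESIDUES_PER_DOMAIN * DA_PER_RESIDUE / 1000


if __name__ == "__main__":
    for name, n in DOMAINS.items():
        print(f"{name}: {n} domains, about {approx_mass_kda(n):.1f} kDa")
```

The estimates (about 145, 48, and 24 kDa) fall close to the rounded values of
150, 50, and 25 kDa given in the text.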

In an intact antibody molecule, the Fc portion elicits several immunological
responses after antigen-antibody binding occurs.

• The complement cascade is activated. The components of this
system break down cell membranes, activate phagocytes, and generate
signals to mobilize other components of the immunological
response system.

• Antibody-dependent cell-mediated cytotoxicity (ADCC), which is 
the result of the binding of the Fc portion of the antibody to an Fc 
receptor of an ADCC effector cell, is produced. The bound effector 
cell releases substances that lyse the foreign cell to which the Fab 
portion of the antibody molecule is bound. 

• After the Fab region binds to a soluble antigen, the Fc portion of an 
antibody can be bound to Fc receptors of phagocytic cells, which 
engulf and destroy the antibody-antigen complex. 

FIGURE 10.24 Structure of an antibody molecule. The H and L chains contain
variable regions (VL and VH) with their CDRs (CDR1, CDR2, and CDR3) and constant
domains (CL, CH1, CH2, and CH3). The Fv, Fab, and Fc portions of an antibody
molecule are delineated. The N-terminal (NH2) and C-terminal (COOH) ends of each
polypeptide chain are indicated.


BOX 10.3

Rituximab and Ibritumomab: Therapeutic Monoclonal Antibodies That Treat
Non-Hodgkin Lymphoma

Non-Hodgkin lymphoma is a malignant growth of B or T cells of the lymph
system. It has been estimated by the American Cancer Society that in 2007
alone approximately 63,000 new cases of non-Hodgkin lymphoma were diagnosed,
resulting in approximately 19,000 deaths. In fact, about 5 million people
worldwide have non-Hodgkin lymphoma, 5 to 10% of these people die every year,
and the incidence of the disease is growing. It is the fifth most common
cancer (although there are about 29 different lymphomas in this category),
with an individual's chance of developing the disease in their lifetime
being about 1 in 50.

There are a variety of treatments for patients with non-Hodgkin lymphoma,
including radiation therapy, chemotherapy, immunotherapy, bone marrow
transplantation, and "watch and wait" for slowly growing cases. In 1997, the
FDA approved the use of rituximab (Rituxan) for the treatment of non-Hodgkin
lymphoma. Rituximab is a genetically engineered chimeric (murine/human)
monoclonal antibody directed against the CD20 antigen (a protein on the
surfaces of B lymphocytes). Following binding of the antibody to CD20, the
body's defenses attack and kill the antibody-marked B cells. Stem cells in
bone marrow lack CD20, so they are uninhibited by this treatment. Healthy B
cells can regenerate from those stem cells, after the completion of the
course of rituximab treatment (given once a week for 4 to 8 weeks), and
return to normal levels within several months. In 2006, the FDA approved
the use of rituximab in combination with CHOP (cyclophosphamide,
doxorubicin, vincristine, and prednisone) and other anthracycline-based
chemotherapy regimens. In addition, the use of rituximab in combination with
the chemical compound methotrexate was approved for the treatment of
moderately to severely active rheumatoid arthritis in patients who had been
refractory to other treatments.

Notwithstanding some severe side effects in some patients, rituximab has
been enormously successful. Hundreds of thousands of people worldwide who
did not respond well to conventional chemotherapy have been successfully
treated with rituximab. In fact, while the incidence of non-Hodgkin lymphoma
continues to increase, since the introduction of rituximab, mortality from
the disease in the United States has declined at a rate of approximately
2.3% a year. In 2002, the FDA approved the use of ibritumomab tiuxetan
(Zevalin) together with rituximab. Ibritumomab is also a monoclonal antibody
that targets B cells. However, ibritumomab is linked to a chemical chelator
molecule (tiuxetan) that binds tightly to radioactive indium-111 or
yttrium-90. Thus, a therapeutic regimen with ibritumomab tiuxetan targets
tumor cells with a high dose of radiation. In late 2007, treatment with
ibritumomab tiuxetan was priced at approximately $24,000 per month, with
treatments typically lasting 1 or 2 months. Treatment with ibritumomab
tiuxetan is quite toxic, and around half of the treated individuals
experience side effects. Therefore, ibritumomab tiuxetan is approved only
for patients who have failed to respond to other treatments.


Preventing Rejection of Transplanted Organs

In the 1970s, passive immunization was reconsidered as a way of preventing
immunological rejection of a transplanted organ. The rationale was
to administer to patients a specific antibody that would bind to certain
lymphocytes and diminish the immune response directed against the
transplanted organ. The mouse monoclonal antibody OKT3 was approved
in 1986 by the FDA for use as an immunosuppressive agent after organ
transplantation in humans (Table 10.3). Lymphocytes that differentiate in
the thymus are called T cells. Various members of the T-cell population act
as immunological helper and effector cells and are responsible for organ
rejection. The OKT3 monoclonal antibody binds to a cell surface receptor
called CD3, which is present on all T cells. As a result, a full immunological
response is blocked, and the transplanted organ is not rejected.
Immunosuppression by this means was reasonably effective, although as
anticipated, because the antibody was from a mouse, there were some side
effects, including fever and rash formation.


Recombinant Antibodies 

Hybrid Human-Mouse Monoclonal Antibodies 

The modular nature of antibody functions has made it possible to convert
a mouse monoclonal antibody into one that has some human segments but
still retains its original antigen-binding specificity. This hybrid molecule is
called a chimeric antibody (Fig. 10.25), or, with more human sequences, a
"humanized" antibody (Fig. 10.26). The difference between a chimeric and
a humanized mouse monoclonal antibody depends on which portions of
the mouse antibody are removed. The first portion of a mouse monoclonal
antibody that was targeted for replacement with a human sequence was
the mouse Fc fragment. The mouse Fc fragment was chosen because it
functions poorly as an effector of immunological responses in humans. It is
also the most likely fragment to elicit the production of human antibodies.
To diminish immunogenicity and to introduce human Fc effector capabilities,
the DNA coding sequences for the Fv regions of both the L and the H
chains of a human immunoglobulin were replaced with the Fv DNA
sequences for the L and H chains from a specific mouse monoclonal
antibody (Fig. 10.25). This replacement of Fv coding regions can be
accomplished by using either oligonucleotides with in vitro DNA replication or
cloned DNA segments. The DNA constructs for both chimeric chains were
cloned into an expression vector and transfected into cultured B
lymphocytes, from which the chimeric antibody was collected. Chimeric antibodies
are composed of approximately 70% human and 30% mouse DNA
sequences.


FIGURE 10.25 Genetically engineered chimeric antibody. The VL and VH DNA regions
from the immunoglobulin L and H genes that encode part of a mouse monoclonal
antibody were substituted for the VL and VH DNA regions of a human
immunoglobulin molecule. The product of the constructed gene is a chimeric (partially
humanized) immunoglobulin with the antigen-binding specificity of the mouse
monoclonal antibody and both lowered immunogenicity in humans and human Fc
effector capabilities.

When a chimeric antibody that contained the binding site from a
mouse monoclonal antibody directed against the surfaces of human colon
cancer cells was tested in patients with colorectal cancer, it remained in the
blood system about six times longer than the complete mouse monoclonal
antibody, thereby extending the period of effectiveness. Only 1 patient of
the 10 developed a mild immunological reaction to the chimeric antibody.
However, in this trial, no antitumor effects were observed, an outcome that
may be due to the low dose levels or to the advanced stage of the cancer in
the subjects.


FIGURE 10.26 Genetically engineered humanized antibody. The CDRs (CDR1, CDR2,
and CDR3) from the genes for H and L immunoglobulin chains of a mouse
monoclonal antibody replace the CDRs of the genes for a human antibody. The product
of this constructed gene is an immunoglobulin with the antigen-binding specificity
of the mouse monoclonal antibody and all the other properties of a human antibody
molecule.

The "humanizing" of mouse and rat monoclonal antibodies has been
taken one step further than the formation of chimeric molecules by
substituting into human antibodies only the CDRs of the rodent monoclonal
antibodies (Fig. 10.26). Humanized antibodies consist of approximately
95% human and 5% mouse DNA sequences. Because these engineered
human antibodies have antigen-binding affinities similar to those of the
original rodent monoclonal antibodies, they may be more effective
therapeutic agents.

The humanizing of rodent monoclonal antibodies may be performed as
follows. Starting with a rodent hybridoma cell line, cDNAs for the L and H
chains are isolated. The variable regions of these cDNAs are amplified by
PCR. The oligonucleotide primers that are used for this amplification are
complementary to the sequences at the 5' and 3' ends of the DNA encoding
the variable regions. From the nucleotide sequences of the cDNAs for the L
and H variable regions (VL and VH), it is possible to delineate the limits of
the CDRs. It is usually straightforward to determine where the CDRs begin and
end, because these regions are highly variable in sequence while the sequences
of the framework regions (FRs) are relatively conserved. On the basis of the
sequences of the DNAs encoding the rodent CDRs, six pairs of oligonucleotide
PCR primers are synthesized. Each pair of primers is designed to
initiate the synthesis of the DNA for one of the six rodent CDRs, three
from the L chain and three from the H chain. In addition, each primer
includes an extra 12 nucleotides at its 5' end, complementary to the flanking
regions within the human framework DNA into which the DNA for the
rodent CDRs is targeted (Fig. 10.27). Oligonucleotide-directed mutagenesis
is then used to replace, one at a time, the complete DNA sequence for each
of the human CDRs with the amplified DNA for the rodent CDRs. Thus, it
is necessary to carry out six cycles of oligonucleotide-directed mutagenesis,
one cycle to replace each CDR. This procedure, in effect, "grafts" the rodent
CDRs onto the human antibody framework. The humanized variable-region
cDNAs are then cloned into expression vectors, which are then
introduced into appropriate host cells, usually either E. coli or mammalian
cells, for the production of antibodies.


FIGURE 10.27 PCR amplification of CDR1 from a rodent monoclonal antibody L
chain cDNA. The PCR primers P1 and P2 contain oligonucleotides complementary
to the rodent CDR1 DNA. In addition, P1 and P2 each contain 12 nucleotides at
their 5' ends that are complementary to the FRs of human monoclonal L chain
cDNAs. Using six separate pairs of oligonucleotide primers (three for the VL region
and three for the VH region), each of the rodent CDRs is separately amplified by
PCR. Then, by PCR, the amplified rodent CDRs are spliced into human antibody
genes in place of the resident CDRs. This grafting is made possible by the presence
of DNA complementary to the human FRs on the amplified rodent CDR DNAs.
[Figure labels, in order along the L chain cDNA: peptide, FR1, CDR1, FR2, CDR2,
FR3, CDR3, FR4]
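
The layout of a CDR-grafting primer pair (a 3' portion that anneals to the
rodent CDR and a 12-nucleotide 5' tail matching the adjacent human framework)
can be sketched as follows. This is a minimal illustration under stated
assumptions, not the published protocol: the function names, the 15-nucleotide
annealing length, and the toy sequences are all hypothetical, every input is
taken to be the sense strand written 5' to 3', and practical concerns such as
melting temperature are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return dna.translate(COMPLEMENT)[::-1]


def grafting_primers(human_fr_up: str, rodent_cdr: str, human_fr_down: str,
                     tail: int = 12, anneal: int = 15) -> tuple:
    """Build (forward, reverse) primers for amplifying one rodent CDR.

    human_fr_up / human_fr_down: human framework DNA flanking the CDR site.
    rodent_cdr: DNA encoding the rodent CDR to be grafted.
    tail: length of the 5' human-framework tail (12 nt in the text).
    anneal: length of the CDR-annealing part (ASSUMPTION, not from the text).
    """
    if len(human_fr_up) < tail or len(human_fr_down) < tail:
        raise ValueError("framework flanks must be at least `tail` long")
    anneal = min(anneal, len(rodent_cdr))
    # Forward primer: human FR tail followed by the start of the rodent CDR.
    forward = human_fr_up[-tail:] + rodent_cdr[:anneal]
    # Reverse primer: reverse complement of (end of CDR + start of human FR),
    # so its 5' end is complementary to the downstream human framework.
    reverse = reverse_complement(rodent_cdr[-anneal:] + human_fr_down[:tail])
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = grafting_primers(
        "AAAACCCCGGGGTTTT",          # toy upstream human FR (hypothetical)
        "ATGCATGCATGCATGCATGC",      # toy rodent CDR (hypothetical)
        "GGGGAAAATTTTCCCC",          # toy downstream human FR (hypothetical)
    )
    print("forward 5'-" + fwd + "-3'")
    print("reverse 5'-" + rev + "-3'")
```

Because both primers carry human framework sequence at their 5' ends, the
amplified rodent CDR is flanked by DNA that matches the human gene, which is
what allows it to be spliced in place of the resident human CDR.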

To date, more than 50 different monoclonal antibodies have been
humanized. While this technology is clearly effective and widely
applicable, it is nevertheless time-consuming and expensive. In the future,
other strategies (described below) will probably be used to produce
human antibodies and antibody fragments, such as (1) phage display
combinatorial libraries that are constructed from mRNA from human B cells
from nonimmunized donors and (2) transgenic mice that express the entire
human antibody repertoire.

Human Monoclonal Antibodies 

Although most of the immunotherapeutic agents that have been developed
have been effective, there are drawbacks to the use of monoclonal
antibodies that contain nonhuman sequences. For example, if multiple
treatments are required, which is often the case, it is desirable that the
antibody contain no or only a very limited amount of nonhuman sequences to
prevent immunological cross-reactivity and sensitization of the patient.
Unfortunately, it is very difficult to create human monoclonal antibodies
for a number of reasons. During hybridoma formation, the human chromosomes
of fused human lymphocyte-mouse myeloma cells are unstable,
and cells that produce a human monoclonal antibody are rarely formed. To
date, no human myeloma cell line has been discovered that can replace the
mouse myeloma cell line in this procedure. Even if it were possible to form
human hybridoma cell lines, it is contrary to accepted norms of medical
research to inject humans with a specific antigen for nontherapeutic
purposes and to perform a partial splenectomy to collect antibody-producing
cells. Therefore, it has been necessary to devise other approaches for
obtaining human monoclonal antibodies.


FIGURE 10.28 Generation of a XenoMouse. Mouse antibody genes are inactivated by
specific deletions in embryonic stem cells, which are subsequently used to generate
transgenic mice unable to make antibodies. The human genes encoding
immunoglobulin light and heavy chains are introduced on a YAC into mouse embryonic
stem cells. These cells are used to generate transgenic mice able to synthesize both
mouse and human antibodies. The mice generated from these two types of
manipulation are cross-bred, and mice that can synthesize only human immunoglobulins
are selected, immunized, and used to make hybridomas producing human
antibodies.
[Figure labels: "Inactivate mouse IgH and Igκ chain genes"; "Introduce human IgL
loci and IgH locus into a cell that has mouse IgH and Igκ genes"]

To address this need, researchers constructed a "XenoMouse" in which
(1) the mouse antibody production machinery was inactivated and (2) all
of the human immunoglobulin loci (both light and heavy chains) were
integrated into a mouse chromosome (Fig. 10.28). The human heavy chain
genes and the human κ and λ light chain genes (where κ and λ are different
classes of light chain genes) were cloned onto yeast artificial chromosomes
(YACs) that can carry very large amounts of foreign DNA. The YACs with
the human immunoglobulin genes were then introduced into mouse
embryonic stem cells by fusing YAC-containing yeast spheroplasts with the
embryonic stem cells. This procedure yields a large number of embryonic
stem cells in which all of the introduced human immunoglobulin genes
have become stably integrated into the chromosomal DNA. These
transfected cells were used to generate mice containing human immunoglobulin
gene loci. Cross-breeding of two mouse lines, one carrying both mouse and
human immunoglobulin genes and the other carrying the deleted mouse
immunoglobulin genes, produced a mouse strain (called XenoMouse) that
expresses only human immunoglobulins. It is now possible, after
immunization of a XenoMouse with a particular antigen, to produce a fully human
immunoglobulin. The large human antibody repertoire in the XenoMouse
has enabled researchers to produce a number of fully human antibodies,
many of which are currently at various stages of clinical development. For
example, the first fully human monoclonal antibody produced using this
technology is panitumumab, which was approved in September 2006 for
the treatment of certain forms of colorectal cancer (Table 10.3).


FIGURE 10.29 Schematic representation of active antibodies and antibody fragments.
[Figure labels: intact IgG, bivalent diabody, bispecific diabody]

Antibody Fragments 

Naturally occurring antibodies are highly specific targeting reagents that 
provide animals with a powerful means of defending themselves against a 
wide range of pathogenic organisms and toxins. Immunoglobulin G (IgG) 
is the main antibody found in mammalian serum, and it is the native form 
that is almost exclusively used in therapeutic antibodies (Fig. 10.29). The 
fact that IgG molecules have two identical sites that bind to two identical 
antigens (i.e., they are bivalent) generally increases their effectiveness in 
vivo. While the Fc portion of the IgG molecule is important in recruiting
cytotoxic effector functions through complement or interaction with specific
receptors, Fc-mediated effects are not necessary for all applications
and may even sometimes be undesirable. Over the past several years, by
manipulating portions of the IgG light and heavy chain cDNAs, researchers
have constructed a variety of IgG derivatives or fragments that may be
used instead of whole antibody molecules (Fig. 10.29). Some of these
molecules, because of their small size, bind more efficiently to targets that
are inaccessible to conventional whole antibodies. Others have multiple
sites for binding to the same antigen, while still others have binding
specificities for two or more target antigens.

Initially, based on Fab and Fv fragments of antibodies, antigen-binding
single protein chains (scFv) consisting of only VL and VH domains were
developed. Single-chain antibodies may be used for a variety of therapeutic
and diagnostic applications in which Fc effector functions are not required
and when small size is an advantage. Single-chain antibodies have a
molecular mass of approximately 27 kDa compared with approximately 150 kDa
for IgG molecules. Because of their small size, single-chain antibodies can
penetrate and distribute in large tumors more readily than intact
antibodies. In addition, a protein-coding sequence can be linked to a
single-chain antibody sequence to create a dual-function molecule that can both
bind to a specific target and deliver a toxin or some other specific activity
to a cell (Fig. 10.30).

FIGURE 10.30 Schematic representation of an scFv immunotoxin (A) and a
disulfide-stabilized Fv immunotoxin (B).


MILESTONE

Synthesis in E. coli of a Polypeptide with Human
Leukocyte Interferon Activity

S. Nagata, H. Taira, A. Hall, L. Johnsrud, M. Streuli, J. Ecsodi,
W. Boll, K. Cantell, and C. Weissmann
Nature 284:316-320, 1980

The late 1970s and early 1980s were a time of tremendous excitement in
molecular biotechnology. The promise of this new technology was being touted
to both the public and large institutional investors. One of the products of
the new biotechnology that captured the imagination of a large number of
people was IFN, which at the time was seen by many as a possible miracle cure
for a wide range of viral diseases and cancers. Thus, the isolation of a human
IFN cDNA and its subsequent expression in E. coli were reported in newspapers
and magazines around the world.

Several features of IFN made it particularly difficult to synthesize and
isolate a cDNA encoding the polypeptide. First, although IFN had been
purified more than 80,000-fold, only minuscule amounts were available, so
researchers did not even have an accurate estimate of its molecular mass.
Second, unlike many other proteins, IFN did not have a chemical or biological
activity that was easy to monitor. At the time, its activity was measured
by the reduction in the cytopathic effect of an animal virus on cells in
culture, which was an extremely complex and time-consuming process.
Third, unlike the situation with insulin, researchers had no idea whether
there was one particular human cell type that produced high levels of IFN and
therefore could serve as a source of mRNA that was enriched for IFN mRNA.
These problems notwithstanding, a cDNA encoding IFN was eventually isolated
and characterized. Since that time, researchers have discovered that there
are several different types of IFNs. Unfortunately, IFN is not the panacea
that was dreamed of by both investors and the press. However, the genes for
several IFNs have been isolated, and clinical trials have shown that they
are effective treatments for a variety of viral diseases.


Computer simulations of the three-dimensional structure of a potential
single-chain antibody showed that the VL and VH domains have to be
separated by a linker peptide to assume the correct conformation for antigen
binding. On the basis of this design constraint, DNA constructs of VL and
VH sequences from a cDNA of a cloned monoclonal antibody were each
ligated to a chemically synthesized DNA linker fragment in the order
VL-linker-VH. After expression in E. coli, the single-chain protein was
purified, and both its affinity and specificity were found to be equivalent to
those of the original intact monoclonal antibody. Moreover, instead of
linking the VH and VL chains with a short peptide, amino acids in the FR
can be modified to form a disulfide linkage between the two peptides (Fig.
10.30B). The effectiveness of this disulfide-stabilized Fv molecule
(VL-S-S-VH) coupled to a cancer cell toxin was compared with that of an scFv
molecule coupled to the same toxin. The disulfide-stabilized and scFv
immunotoxins had the same activity and specificity. However, the former
molecule was severalfold more stable than the latter. This suggests that
disulfide-stabilized Fv molecules may be more useful than scFv molecules
in some therapeutic applications. The two types of molecules have been
used in different ways. For example, by altering the number of amino acids
in the linker in an scFv molecule (usually to five or fewer amino acids), it is
possible to direct the self-assembly of these molecules into bivalent
dimers, called diabodies (Fig. 10.29), trimers (triabodies), or tetramers
(tetrabodies). Shortening the linker affects not only multimerization but
also the stability of the molecule; molecules with a shorter linker
tend to be more stable. In addition, it is possible to combine two
different antigen specificities into a single bispecific diabody (Fig. 10.29).

Most recombinant antibody-toxin combinations (immunotoxins) have 
been constructed using Pseudomonas exotoxin A, which is a 66-kDa protein 
with three separate domains. Domain I is responsible for cell binding, 
domain II for translocation of the protein into the cell, and domain III for 
ADP-ribosylation (Fig. 10.31A). Other protein toxins that have been used 
include bacterially derived diphtheria toxin and the plant-derived toxin 
ricin. An immunotoxin is generally synthesized by replacing the N-terminal
domain of the toxin, e.g., domain I of Pseudomonas exotoxin A, with the
single-chain antibody sequence, thereby creating molecules very similar in
size to the original toxin with the ability to bind, enter, and kill a specific
cell (Fig. 10.31B). A number of immunotoxins that have antitumor activity
in vitro and in animal models have been constructed. These include
antibodies directed against the p55 subunit of the interleukin-2 receptor, the
transferrin receptor, carbohydrate antigens, the epidermal growth factor
receptor, and some cancer cell surface proteins. Toxin molecules may also
be directed to cancer cells by using a bispecific diabody that is engineered
to bind both to a tumor-associated cell surface antigen and to a toxin
molecule, thereby directing the toxin molecule to the tumor (Fig. 10.32). A
number of different engineered immunotoxins are currently in clinical 
trials. 
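
The domain replacement described above can be pictured as a simple
substitution in an ordered list of domains. The Python sketch below is only
an illustration of that architecture: the Domain record, the
make_immunotoxin helper, and the wording of the scFv function are
hypothetical names introduced here, while the three exotoxin A domain
functions are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    function: str


# Domain functions of Pseudomonas exotoxin A as described in the text,
# listed from the N terminus to the C terminus.
EXOTOXIN_A = (
    Domain("domain I", "cell binding"),
    Domain("domain II", "translocation into the cell"),
    Domain("domain III", "ADP-ribosylation"),
)


def make_immunotoxin(toxin, targeting):
    """Replace the N-terminal (cell-binding) domain with a targeting domain."""
    return (targeting,) + tuple(toxin[1:])


# Hypothetical targeting module standing in for a single-chain antibody.
scfv = Domain("scFv", "binding to a specific cell surface antigen")

for domain in make_immunotoxin(EXOTOXIN_A, scfv):
    print(f"{domain.name}: {domain.function}")
```

The translocation and ADP-ribosylation domains are left untouched, so the
construct keeps the ability to enter and kill a cell while its binding
specificity is supplied by the antibody fragment.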

It may be possible to create peptides that are smaller than scFvs and
still retain the ability to bind to a specific antigen. The rationale for
developing smaller antibody-toxin complexes is that they are more likely to
penetrate a tumor and more completely stop tumor growth. Recently, a
group of researchers constructed a short peptide (28 amino acids long) that
retained the binding specificity of the monoclonal antibody from which it
was derived. It is well established that antibody-binding specificity resides
within the six hypervariable loops called CDRs, three from the variable
region of the heavy chain and three from the variable region of the light
chain (Fig. 10.24 and 10.33). In all antibody molecules, the CDRs are flanked
by FRs. Moreover, it was speculated, at least for some antibodies, that the
major portion of the antigen-binding site might reside primarily within two
CDRs, one from the heavy chain and the other from the light chain. To test
this possibility, starting with genes for the variable portion of a monoclonal
antibody against a surface protein from Epstein-Barr virus, eight different
peptide combinations were synthesized. Each peptide contained at least
one CDR3 loop (known to be the major antigen-contacting segment), as
well as one other CDR loop and a linker peptide (usually an FR spacer). All
eight of these peptides were tested in vitro for the ability to compete with
the parental antibody for binding to the surface protein of Epstein-Barr
virus (thought to be the causative agent of Burkitt lymphoma and other
cancers). One of the peptide combinations, VH CDR1-VH FR2-VL CDR3, appeared
to be promising (Fig. 10.33). Next, this peptide was coupled to a toxin
molecule, colicin Ia, and the combination was tested both with cells in culture
and in mice. In mice, the peptide-colicin adduct efficiently traveled
through the circulation and then found and killed the tumor cells expressing
the target antigen. Colicin Ia by itself does not affect these tumors to any
significant extent. Also, the original monoclonal antibody is unable to
penetrate into the tumor. On the other hand, the peptide-colicin adduct
accumulated at the cores of the targeted tumors. This very exciting work is at
an early stage of development, so a large number of issues remain to be
addressed before it can become an effective human therapeutic measure.
Nevertheless, the demonstration that a small peptide can mimic the
binding specificity of an entire antibody molecule and successfully deliver
a cellular toxin to targeted cells may provide the basis for a whole new
approach for treating tumors.

Combinatorial Libraries of Antibody Fragments 

Hybridoma cells, like most other animal cells in culture, grow relatively
slowly, do not attain high cell densities, and require complex and
expensive growth media. The cost of monoclonal antibody production is an
impediment to their more widespread use as therapeutic agents. To
circumvent this problem, attempts have been made to genetically engineer
bacteria, plants, and animals to act as "bioreactors" for the production of
monoclonal antibodies. For effective delivery and function of some
immunotherapeutic agents, only the antigen-binding region of an antibody (the
Fab or Fv fragment) is required. In other words, the Fc portion of an
antibody is dispensable for some applications.

FIGURE 10.31 Domain structure of Pseudomonas exotoxin A (A) and a single-chain
antibody-Pseudomonas exotoxin A (B). The functions of the various domains are
shown.

FIGURE 10.32 Schematic representation of the binding of a diabody to a protein
molecule on the surface of a cancerous cell, as well as the binding of a toxin
protein molecule to the other portion of the diabody.

An elaborate series of manipulations makes it possible to select, as well 
as produce, functional antibodies in E. coli (Fig. 10.34). 

1. cDNA is synthesized from mRNA isolated from mouse antibody- 
producing cells (B lymphocytes). 

2. The H and L chain sequences in the cDNA preparation are amplified
separately by PCR.

3. Each amplified cDNA preparation is treated with a specific set of
restriction endonucleases and cloned into a bacteriophage λ vector.
The cDNA sequences of the H and L chains each have distinctive
restriction endonuclease recognition sites, an arrangement that
facilitates the directional cloning of each sequence into a separate
bacteriophage λ vector. At this stage of the process, many different
H and L chain sequences are cloned (Fig. 10.35A and B).

4. The cDNAs of one H and one L chain are cloned into a single
"combinatorial" vector, thereby enabling the bacteriophage to coexpress
both chains, thus forming an assembled antibody Fv fragment (Fig.
10.35C).

5. The H and L chains are expressed during the lytic cycle of
bacteriophage λ, so that the library of combinatorial bacteriophage clones
can be screened for the presence of antigen-binding activity.

FIGURE 10.33 Organization of VH and VL regions of a monoclonal antibody and the
development of a peptide, based on a portion of the CDR and FR of the VH and VL
regions of the antibody molecule, with a similar binding specificity.

FIGURE 10.34 Procedure to create a combinatorial library of the VL and VH regions of
antibody chains in E. coli. Note that the H and L chains are amplified in separate
PCRs.

FIGURE 10.35 DNA constructs of an Fv combinatorial gene library cloned into
bacteriophage λ DNA. (A and B) Portions of the cDNAs of the L (A) and H (B) chains are
separately cloned into bacteriophage λ vectors. (C) Each of these libraries is
digested with EcoRI, and then the fragments from the H chain library are ligated to
the fragments from the L chain library, thereby creating a combinatorial library that
contains all possible combinations of L and H chain fragments. plac, the E. coli lac
promoter; RBS, ribosome-binding site.

The step in which L and H chain cDNAs are combined on one vector
creates a vast array of diverse antibody genes, some of which encode unique
target-binding sites whose isolation would never have been possible by
standard hybridoma procedures. The mammalian antibody repertoire has
the potential to produce approximately 10⁶ to 10⁸ different antibodies. A
phage library contains approximately this number of clones, so one
combinatorial library can be expected to produce as many different antibodies (in
this case, Fv molecules) as any mammal. In addition, once an initial
combinatorial library has been constructed, it is possible to shuffle the L and H
chains to obtain Fv molecules that recognize unusual epitopes, and even
greater variation may be achieved by random mutagenesis of the DNAs in
the combinatorial library (see chapter 8). Because millions of bacteriophage
plaques can be screened in a relatively short period, the identification of Fv
molecules with the desired specificity takes only about 7 to 14 days. By
contrast, screening hybridoma cell lines is a slow, time-consuming process.
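
The size of a combinatorial library follows directly from the pairing step:
if every cloned H chain fragment can be ligated to every cloned L chain
fragment, the number of distinct Fv combinations is the product of the two
library sizes. The short Python sketch below illustrates this arithmetic.
The clone names and the figure of 1,000 clones per chain library are
assumptions chosen for illustration and do not come from an actual
experiment.

```python
from itertools import product


def library_size(n_heavy: int, n_light: int) -> int:
    """Number of distinct pairings when every H clone can pair with every L clone."""
    return n_heavy * n_light


def enumerate_pairs(heavy_clones, light_clones):
    """List every (H, L) combination for a small illustrative library."""
    return list(product(heavy_clones, light_clones))


# Hypothetical clone names, for illustration only.
heavy = ["H1", "H2", "H3"]
light = ["L1", "L2"]
pairs = enumerate_pairs(heavy, light)
print(len(pairs), pairs)

# Assumed library sizes: 1,000 H chain clones and 1,000 L chain clones.
print(library_size(1_000, 1_000))
```

Under these assumptions, two modest chain libraries of 1,000 clones each
already give a million pairings, which is the lower end of the repertoire
size mentioned above.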

Because they lyse bacterial host cells, bacteriophage λ vectors are not
particularly useful for the production of large quantities of protein. To
overcome this drawback, the bacteriophage λ vector was engineered so
that the H and L chain DNA sequences were inserted into a site that was
flanked by plasmid DNA sequences. This plasmid DNA, containing an H
and L chain DNA combination, can be excised from the bacteriophage λ
vector and transformed into E. coli (Fig. 10.34). As part of a plasmid, large
numbers of Fv fragments can be produced in E. coli cells.

FIGURE 10.36 Formation of an Fv antibody combinatorial library in the filamentous
bacteriophage M13. Following extraction of mRNA and its conversion to cDNA, the
cDNAs for the VL and VH regions are amplified (separately) by PCR and then ligated
to DNA that encodes a short linker peptide. Each single-chain antibody-DNA
construct is cloned into gene 3 of bacteriophage M13. There are three copies of the phage
gene 3 protein, which is a phage surface protein, per M13 bacteriophage.

As an alternative to the use of bacteriophage λ, filamentous
bacteriophages, such as M13 and fd, have been used for the production of
combinatorial libraries (Fig. 10.36). In these cases, the antibody fragment is
synthesized as part of a fusion protein that is located on the outer surface
of the bacteriophage. A combinatorial library of antibody fragments
displayed on the surface of a filamentous bacteriophage can be screened by an
enzyme-linked immunosorbent assay-like system. Briefly, samples
(aliquots) of the library are added to the wells of a multiwell plate that are
coated with the target antigen (Fig. 10.37). The wells are rinsed thoroughly
to remove any unbound bacteriophage. Next, an antibody that binds to the
bacteriophage coat protein and is conjugated with an enzyme is added to
each well. The wells are rinsed to remove any unbound antibody-enzyme
complex. The phage particles bound to the target antigen are recognized by
the antibody-enzyme complex. A chromogenic substrate that is cleaved by
the bound enzyme is then used to determine which wells contain a phage
carrying antibodies to the target antigen. This approach is easier than using
plaque assays with bacteriophage λ to select and subsequently purify a
bacteriophage producing an antibody fragment that binds to a specific
antigen. Once a desired antibody fragment-producing bacteriophage has
been isolated, using either bacteriophage λ or M13, the DNA can be
isolated and subcloned into an expression vector. These procedures are used
to produce mouse, chimeric, or humanized antibodies.
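
The well-by-well readout of such a screen can be summarized in a few lines
of code. The following Python sketch assumes that each well yields a single
absorbance reading, that a few control wells without target antigen are
included, and that a well is called positive when its reading exceeds the
control mean by three standard deviations. The cutoff rule, the well names,
and all of the readings are illustrative assumptions and are not part of
the procedure described in the text.

```python
from statistics import mean, stdev


def positive_wells(readings, control_readings, n_sd=3.0):
    """Return the wells whose signal exceeds mean + n_sd * SD of the controls."""
    cutoff = mean(control_readings) + n_sd * stdev(control_readings)
    return sorted(well for well, value in readings.items() if value > cutoff)


# Hypothetical absorbance values (arbitrary units) for library aliquots.
readings = {"A1": 0.08, "A2": 1.35, "A3": 0.11, "B1": 0.92, "B2": 0.09}

# Hypothetical control wells that were not coated with the target antigen.
controls = [0.07, 0.09, 0.10, 0.08]

print(positive_wells(readings, controls))
```

Phage recovered from the wells flagged in this way would then be carried
forward for isolation of the DNA, as described above.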

FIGURE 10.37 Immunological screening of a bacteriophage M13 combinatorial
library.

A Combinatorial Library of Full-Length Antibodies

Until recently, all of the combinatorial libraries of antibodies included either
single-chain antibodies or Fab fragments and not full-length antibodies.
However, for many applications it is advantageous for therapeutic
antibodies to be full length. With this in mind, researchers cloned and expressed
complete antibody molecules (using either two separate vectors, one
encoding a light chain and the other encoding a heavy chain, or a single
operon controlling the expression of both the light and heavy chain on a
single vector) in E. coli.

The process of selecting antibodies with specified affinities would be
greatly facilitated if an E. coli library could be screened directly for binding
to various antigens. To do this, a combinatorial library of full-length
antibodies was generated and engineered so that the antibodies were secreted
into the periplasm between the inner and outer membranes. In addition,
prior to the expression of the antibody molecules, the host E. coli was
engineered to express a fusion protein that consisted of a lipoprotein fragment,
which anchored it within the inner membrane, and a portion of
Staphylococcus aureus protein A, which binds specifically to Fc regions
(Fig. 10.38). When an IgG molecule is secreted into the periplasm, the
fusion protein binds to the Fc region. The tightly bound IgG-fusion protein
complex remains intact when the E. coli cells are treated with EDTA and
lysozyme to remove a portion of the outer membrane. Then, a fluorescently
labeled target antigen is added to detect those cells that have expressed an
antibody directed against that target antigen. By cell sorting, fluorescently
labeled cells are selected. Then, the plasmid DNA that encodes the selected
IgG molecules is isolated and expressed in an E. coli strain that does not
synthesize the membrane-anchored fusion protein. This simple, yet
powerful, technique facilitates the selection and production in E. coli of
full-length monoclonal antibodies.

FIGURE 10.38 Schematic representation of the selection of a full-length antibody that
binds to a fluorescently labeled target antigen. The Fc portion of the antibody is
bound by a fragment of S. aureus protein A fused to a portion of an E. coli
lipoprotein anchored in the inner membrane.

Shuffling CDR Sequences

Very large libraries of single-chain antibodies can be the sources of a wide
range of highly specific human antibodies so that one does not have to
resort to using mice for the initial monoclonal antibody. Theoretically, such
libraries contain a greater diversity of antibodies than is normally found in
the human immune system. To construct a library of this sort, B cells and
other antibody-producing cells, from different, nonimmunized individuals
and from different tissues and organs, are collected and pooled before the
mRNA is isolated (Fig. 10.39). The isolated mRNA is used to program the
synthesis of cDNA, which then becomes a template for the specific PCR
amplification of each CDR (separately). The amplified CDRs are mixed
with oligonucleotides encoding the FRs and a linker. Overlap extension PCR
(see chapter 4) is then used to order, join, and amplify these segments into
DNA sequences encoding the variable L and H domains. In addition
to being entirely human, the 2 × 10⁹ different single-chain antibodies that
have been produced in this way have a very wide range of specificities,
reflecting the fact that the CDR sequences have been incorporated in
random order and from a variety of sources, i.e., they have been shuffled.
There are single-chain antibodies against β-galactosidase, the β subunit of
cholera toxin, fluorescein isothiocyanate, human cell surface antigen,
human leptin, human prostate-specific antigen, and streptavidin, among
others. Moreover, the dissociation constants (Kds) for the interaction of the
selected antibodies with their target antigens ranged from 0.9 × 10⁻⁹ to
420 × 10⁻⁹ M (where tight binding is represented by a low number). For
antibodies, Kds of around 0.1 × 10⁻⁸ to 10 × 10⁻⁸ M reflect a high
affinity for the target antigen. The fact that a very large library of
single-chain antibodies, all with high affinity for their target antigens, could easily
be generated by this procedure means that the goal of being able to select
virtually any antibody from a nonimmunized library has been achieved.
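
To see why a lower Kd corresponds to tighter binding, consider the standard
single-site equilibrium relation, in which the fraction of antibody
occupied by antigen is [Ag] / ([Ag] + Kd). The Python sketch below applies
this relation to the two ends of the Kd range quoted above, read here as
0.9 and 420 nM. The free antigen concentration of 10 nM is an arbitrary
value chosen for illustration, and the model ignores avidity and antigen
depletion.

```python
def fraction_bound(antigen_molar: float, kd_molar: float) -> float:
    """Fraction of antibody sites occupied at equilibrium (single-site model)."""
    return antigen_molar / (antigen_molar + kd_molar)


ANTIGEN = 10e-9  # assumed free antigen concentration: 10 nM

# The two ends of the reported Kd range, expressed in M.
for kd in (0.9e-9, 420e-9):
    occupied = fraction_bound(ANTIGEN, kd)
    print(f"Kd = {kd:.1e} M -> fraction bound = {occupied:.3f}")
```

In this simple model, half of the antibody is occupied when the antigen
concentration equals the Kd, so an antibody with a smaller Kd reaches a
given occupancy at a lower antigen concentration.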

FIGURE 10.39 Construction of a large library of single-chain antibodies. B cells from
several nonimmunized individuals were collected and pooled, the mRNA was
isolated and used to program the synthesis of cDNA, oligonucleotide primers
containing DNA sequences that included small portions of the FR sequence were
added to the cDNA preparation, and all six CDRs were amplified separately by
PCR. The amplified CDRs from all six PCRs were mixed together with
oligonucleotides encoding the FRs and the linker, and genes encoding the variable L and H
domains were synthesized by overlap extension PCR. One of the many possible
single-chain antibodies that were synthesized is shown.


MILESTONE

Construction of a Retrovirus Packaging Mutant and Its
Use To Produce Helper-Free Defective Retrovirus

R. Mann, R. C. Mulligan, and D. Baltimore
Cell 33:153-159, 1983

Human gene therapy has always been a controversial subject. However, most
scientists agreed with T. Friedmann and R. Roblin, who in 1972 (Science
175:949-955) stated, "In our view, gene therapy may ameliorate some human
genetic diseases in the future. For this reason, we believe that research
directed at the development of techniques for gene therapy should continue."
The essential features of human gene therapy are the delivery of a remedial
gene and its expression in a cell type or tissue that cures a disease without
risk to either those administering the therapeutic agent or those receiving it.
Initially, vectors derived from human viruses seemed likely to be the
principal mode for delivering remedial genes because they have specialized
mechanisms for entering specific cells. In particular, of the various potential
viral vectors, those based on retroviruses were considered the most
promising. However, a native retrovirus is an infectious agent that can cause
cell damage and, in some instances, induce cancer. The most significant
advance that made human gene therapy possible was the development of a
system for packaging a remedial gene into a noninfectious virus particle that
retains the capability of attaching to its host cell.

Mann et al. constructed the first retroviral packaging cell line. In
essence, they integrated a viral genome from which they had removed a DNA
segment that contained the packaging signal into a chromosome of a cell line.
Under these conditions, the cell line produced noninfectious virus particles.
However, after transfection of these cells with a DNA construct that had a
packaging signal and a remedial gene but no retroviral genes, the construct
was packaged into virus particles that in turn could be used to deliver a
remedial gene to a particular cell type. This clever strategy was
immediately adopted by many other researchers who were working on human gene
therapy. Over the years, the original retrovirus cell-packaging line has been
enhanced, and the concept has been successfully applied to other viral
vectors.

In 1985, as a result of the work of Mann et al. and others, W. F.
Anderson, who has been a persistent advocate for human gene therapy,
noted, "It now appears that effective delivery-expression systems are
becoming available that will allow reasonable attempts at human gene
therapy." On 22 May 1989, Anderson and his colleagues initiated the first
clinical trial using a gene therapy strategy. To date, gene therapy in
general has not been particularly successful; however, as more information
accumulates from ongoing studies, it is inevitable that it will become the
standard mode of treatment for many diseases.


Chemically Linked Monoclonal Antibodies

Drugs that are very effective when tested in cell culture are often much less
potent in a whole organism. This difference in potency is typically due to
the drug not being able to reach the targeted site in the whole animal at a
concentration sufficient to be effective. Increasing the dose of a drug is not 
the answer to this problem, because high drug concentrations often have 
deleterious side effects. A number of different strategies are used to enhance 
the delivery of a drug to its target site. (1) Drugs may be encapsulated in 
liposomes, i.e., particles in which the drug is surrounded by a specific lipid 
surface, that can be targeted to certain organs. (2) Certain toxin genes may 
be incorporated into tumor-infiltrating lymphocytes. These cells can deliver 
the incorporated toxin directly to the site of a tumor. (3) A drug can be 
coupled to a monoclonal antibody that is specific for proteins found only 
on the surfaces of certain cells, e.g., tumor cells (Fig. 10.40). (4) A prodrug 
is an inert form of a drug that requires a specific enzyme to be activated. To 
ensure that the drug is released only in the vicinity of the target cells, the 
activating enzyme is coupled to a monoclonal antibody directed against 
specific cell surface antigens (Fig. 10.40). 

For this type of therapeutic system to be effective, the monoclonal
antibody or single-chain antibody that is complexed with the
prodrug-converting enzyme must be available in quantity in a relatively pure form,
bind to a protein that is highly specific to the target cell, be stable under
physiological conditions but cleared rapidly from circulation, and, when
necessary, be able to penetrate into tumor masses so that all of the cells can
be exposed to the drug. With this approach, only specifically targeted cells
are exposed to the drug, permitting the use of a much lower concentration
than if it were administered directly.

FIGURE 10.40 Schematic representation of a monoclonal antibody-based drug
delivery system. (A) The drug is coupled to a monoclonal antibody. (B) An
enzyme that converts an inactive prodrug to an active drug is attached to
a monoclonal antibody. The active drug is formed only in the immediate vicinity
of the target cells. In both cases, the monoclonal antibody binds to a
specific protein on the surface of the target cell.

Dual-Variable-Domain Antibodies 

In some instances, an antibody (particularly one that is conjugated to a
toxin or radiochemical) is able to destroy a tumor or pathogen cell. In such
cases, it is often advantageous to use antibody fragments, since the Fc
portion of the molecule is not only unnecessary but may also impede or prevent
the rest of the molecule from binding to relatively inaccessible antigens.
Despite the usefulness of antibody fragments in a variety of applications, a
major limitation of using them as therapeutic agents is that, since they lack
the Fc portion of the molecule, they are unable to trigger a complete
immune response. To increase the utility of naturally existing antibodies, as
well as to ensure that they are effective initiators of a complete immune
response, researchers have created what they have termed
"dual-variable-domain immunoglobulins" (Fig. 10.41). These constructs are essentially
IgG molecules containing two tandem Fv regions, each with a different
specificity. Dual-variable-domain immunoglobulins are bispecific and
tetravalent, and they consist entirely of human immunoglobulin sequences.
That is, each molecule contains four Fv regions, two identical Fv regions
directed against one antigen and two identical Fv regions directed against
another antigen. A dual-variable-domain immunoglobulin specific for both
interleukin-12 and interleukin-18 that was produced in CHO cells bound
both of these cytokines, each with an affinity similar to that of the
original monospecific antibody. In addition, in a biological assay using a
severe combined immunodeficient mouse model engrafted with human
peripheral blood mononuclear cells, the interleukin-12/interleukin-18
antibody was as effective at inhibiting induced IFN-γ production as was a
combination of the original two monospecific antibodies. Using this
strategy, it should be possible to generate full-size bispecific antibodies for
a variety of therapeutic purposes.


FIGURE 10.41 Dual-variable-domain immunoglobulin directed against both
interleukin-12 (IL-12) and interleukin-18.















FIGURE 10.42 Targeting tumor cells for destruction by monoclonal antibody-toxin
conjugates. Tumor cells are first treated with the chemotherapeutic agent
irinotecan, which induces the synthesis of a unique cell surface protein. Then, a
monoclonal antibody directed against the cell surface protein and conjugated to a toxin
molecule is added. After binding of the antibody to the cell surface, the toxin is
internalized, thereby killing the tumor cell.


Anticancer Antibodies 

A number of therapeutic antibodies that are directed against protein anti¬ 
genic determinants on the surfaces of cancer cells have been selected 
because the proteins are overexpressed compared to those on noncancerous 
cells. Unfortunately, (1) antibodies directed against these proteins may also 
bind to some noncancerous cells expressing the same or a similar antigen 
and (2) this approach presents researchers with only a limited number of 
targets for therapeutic antibodies. One way to select for additional cell sur¬ 
face targets would be to identify proteins whose expression is selectively 
induced in tumor cells exposed to chemotherapeutic drugs. When
colorectal cancer cells were treated with the drug irinotecan, which is a
topoisomerase inhibitor and is commonly used to treat this type of cancer,
several newly synthesized proteins were found on the surfaces of those 
cells. (Topoisomerases are enzymes that unwind DNA during either DNA 
replication or mRNA transcription.) The new cell surface proteins were 
expressed early, prior to any major effects of the chemotherapeutic com¬ 
pound on cell viability. Monoclonal antibodies directed against one newly 
synthesized cell surface protein (called LY6D/E48) were generated, and 
then the antibodies were complexed with the cellular toxin auristatin E. 
The antibody-toxin conjugate was then used to treat tumor cells that were 
first treated with irinotecan (Fig. 10.42). Following binding to the cell
surface protein, the antibody-toxin conjugate was internalized by the
tumor cell. With this strategy, the tumors disappeared entirely in six out of
eight mice, while in the other two mice, the tumors were dramatically
decreased in size. This exciting approach will have to be tried with larger
numbers of animals before it can begin clinical trials. However, provided
that it is possible to identify one or more proteins that are specifically 
induced by chemotherapeutic agents and are not found on the surfaces of 
nontumor cells, this procedure could become a general strategy that is used 
to treat a variety of different types of human cancer. 


SUMMARY 


A large number of proteins that have potential as therapeutic
agents have been synthesized from cloned genes in
bacteria. Because most of these proteins are from eukaryotic 
organisms, the strategy for the isolation of a target gene often 
involves isolating mRNA enriched in the messenger of interest, 
synthesizing a cDNA library, and subcloning the selected 
target cDNA into an appropriate expression vector. In some 
instances, novel and useful variants of these proteins can be 
constructed either by shuffling functional domains of related 
genes or by directed replacement of functional domains of the 
cloned gene. In addition, long-acting and stable variants of 
some therapeutic proteins have been synthesized. 

In some instances, genetically engineered enzymes may be 
used as therapeutic agents. For example, both recombinant 
DNase I and alginate lyase have been used in an aerosol form 
to decrease the viscosity of the mucus found in the lungs of 
patients with cystic fibrosis. In addition, phenylalanine 
ammonia lyase may help patients with phenylketonuria as a 
replacement for phenylalanine hydroxylase, α1-antitrypsin
may be used to limit some infections, and glycosidases may be 
utilized to convert blood groups A, B, and AB to type O. 


The development of recombinant DNA and monoclonal 
antibody technologies, combined with an understanding of 
the molecular structure and function of immunoglobulin mol¬ 
ecules, has provided specific antibodies as therapeutic agents 
to treat various diseases. Antibody genes can be readily 
manipulated because the various functions of an antibody 
molecule are confined to discrete domains. 

Drugs, prodrugs, or enzymes can be coupled to mono¬ 
clonal antibodies or Fv fragments that are specific for proteins 
found only on the surfaces of certain cells, e.g., tumor cells. 
These antibody-drug or antibody-enzyme combinations act 
as therapeutic agents. However, if the therapy requires mul¬ 
tiple treatments, the antibody component should be from a 
human source to prevent immunological cross-reactivity and 
sensitization of the patient. To achieve this, rodent monoclonal 
antibodies are "humanized" by substituting into human anti¬ 
bodies only the CDRs of the rodent monoclonal antibodies. In 
addition, it has become possible to produce and select human 
monoclonal antibodies in E. coli and in transgenic mice. 
REVIEW QUESTIONS


1. Before the sequencing of the human genome, how would 
you have cloned and expressed a cDNA sequence encoding 
human IFN? You do not have a DNA hybridization probe for 
human IFN, although you have isolated a human cell line that 
can be induced to synthesize IFN approximately 100-fold over 
background levels. Explain your strategy. 

2. What is the Fc portion of an antibody molecule? The Fab 
portion? The Fv portion? The CDR portion? 


3. How are antibody light and heavy chains coordinately syn¬ 
thesized in E. coli? 

4. How would you modify growth hormone to make it 
longer-acting? 

5. Why would DNase I and alginate lyase be useful for 
treating cystic fibrosis? 
6. How is the production of alginate lyase from a cloned gene
detected in E. coli transformants? 

7. What is a combinatorial cDNA library? 

8. How is bacteriophage M13 used to select Fv fragments that 
bind to specific target antigens? 

9. What are disulfide-stabilized and scFv molecules? 

10. How are enzymes coupled to monoclonal antibodies or Fv 
fragments used as therapeutic agents? 

11. How are mouse monoclonal antibodies "humanized"? 
Discuss the reasons for creating humanized monoclonal anti¬ 
bodies. 

12. Describe a protocol for producing a therapeutic agent that 
targets and kills a specific cell type. 

13. How would you engineer TNF-α to be a more specific and
effective anticancer agent? 

14. What would you do to make interleukin-10 more effective 
for treating inflammatory bowel disease? 

15. How can the gene for DNase I be manipulated so that the 
enzyme becomes more effective for treating cystic fibrosis 
patients? 


16. How would you develop a strategy to protect at-risk 
women from HIV infection? 

17. How might low levels of phenylalanine be attained, other 
than with a phenylalanine-free diet, in patients with the 
human genetic disease phenylketonuria? 

18. What types of genetic manipulations can be used to gen¬ 
erate a very large bacterial library of highly specific single¬ 
chain human monoclonal antibodies? 

19. How would you engineer a mouse so that it produces 
only human antibodies? 

20. What is a bispecific diabody? 

21. How would you design a short peptide so that it retains 
the antigen-binding specificity of an entire immunoglobulin 
molecule? 

22. How would you select antibodies against specific antigens in
E. coli? 

23. What are dual-variable-domain immunoglobulins? 

24. How can you use a chemotherapy agent to facilitate the 
targeting of tumor cells with monoclonal antibodies? 


Nucleic Acids as 
Therapeutic Agents 



Often, human disorders, such as cancer, inflammatory conditions,
and both viral and parasitic infections, result from the overproduction
of a normal protein. Therapeutic systems using nucleotide
sequences are being devised to treat these types of conditions. Theoretically,
a small single-stranded nucleotide sequence (oligonucleotide) could
hybridize to a specific gene or messenger RNA (mRNA) and diminish
transcription or translation, respectively, thereby decreasing the amount of
protein that is synthesized. An oligonucleotide that is designed to bind to
a gene and block transcription is called an antigene oligonucleotide, and
one that base pairs with a specific mRNA is called an antisense
oligonucleotide. The binding of an oligonucleotide to a transcription factor that is
responsible for the expression of a specific gene could lower both
transcription and translation of the target gene. Double-stranded oligonucleotides
that attach to DNA-binding proteins could prevent the activation of
transcription of specific genes. Also, synthetic RNA or DNA molecules
called aptamers can be created that bind to proteins that are not naturally
nucleic acid-binding proteins and prevent them from functioning.
Ribozymes, which are natural RNA sequences that bind and cleave specific 
RNA molecules, could be engineered to target an mRNA and subsequently 
decrease the amount of a particular protein that is synthesized. In addition, 
interfering RNAs, small double-stranded RNA molecules that direct the 
sequence-specific degradation of mRNA, may be used instead of either 
antisense RNAs (or oligonucleotides) or ribozymes. The potential for 
nucleic acid therapeutic agents is considerable and is just now beginning to 
be realized. 


Antisense RNA 

To be an effective therapeutic agent, an antisense RNA must bind to a 
specified mRNA and prevent translation of the protein (Fig. 11.1A). The 
possibility of using an expression vector to produce an antisense RNA that 
suppresses a pathogenic condition has been examined. For example,
episomally based expression vectors that carry the complementary DNA (cDNA)
FIGURE 11.1 Inhibition of translation of specific mRNAs by antisense (AS) nucleic
acid molecules. The promoter and polyadenylation regions are marked by p and pa, 
respectively; the intron is indicated by the letter A; and the exons are indicated by 
numbers (1 and 2). (A) A cDNA (AS gene) is cloned into an expression vector in 
reverse orientation, and the construct is transfected into a cell, where the AS RNA 
is synthesized. The AS RNA hybridizes to the target mRNA, and translation is 
blocked. (B) An AS oligonucleotide is introduced into a cell, and after it hybridizes 
with the target mRNA, translation is blocked. 
FIGURE 11.2 A cDNA for human insulin-like
growth factor 1 (ILGF-1) cloned on
a vector in the antisense orientation
under the transcriptional control of
a metallothionein promoter (MTp).
Following transfection into tumor-causing
cells, when low levels of ZnSO4
are added, the cells have decreased
tumorigenicity. The arrows above the
gene and promoter indicate the normal 
direction of transcription. The origin of 
replication and other plasmid sequences 
have been omitted for clarity. 


sequence for either insulin-like growth factor 1 or insulin-like growth factor 
1 receptor were constructed with the cloned sequences oriented so that the 
transcripts were antisense rather than mRNA (sense) sequences.
Insulin-like growth factor 1 is prevalent in malignant glioma, which is the most
common form of human brain tumor. Excess production of insulin-like
growth factor 1 receptor occurs in prostate carcinoma, which is a significant
type of cancer in males. In both vectors, the reverse-oriented cDNAs are
under the control of the metallothionein promoter, which is induced by low
levels of ZnSO4.

Cultured glioma cells were transfected with the vector that produces 
the antisense version of the insulin-like growth factor 1 mRNA. In the 
absence of ZnSO4, the tumorous properties were retained; in contrast, when
ZnSO4 was added to the culture medium, these distinctive features were
lost (Fig. 11.2). In another experiment, nontransfected glioma cells caused 
tumors after they were injected into rats, whereas glioma cells that had 
been transfected with antisense insulin-like growth factor 1 cDNA did not 
develop tumors. 

When mice were injected with rat prostate carcinoma cells that were 
transfected with the insulin-like growth factor 1 receptor cDNA in the
antisense orientation, they developed either small or no tumors, whereas large
tumors were formed when mice were treated with either nontransfected or 
control-transfected rat prostate carcinoma cells. It was assumed that in both 
cases the antisense RNA hybridized with its complementary mRNA 
sequence and hindered translation of insulin-like growth factor 1 and 
insulin-like growth factor 1 receptor, thus preventing the proliferation of 
the cancer cells. 
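
The base-pairing logic behind this assumption can be illustrated with a short Python sketch. The sequence and the function names are hypothetical and were chosen only for illustration. The sketch shows that an RNA transcribed from a cDNA cloned in the reverse orientation is the reverse complement of the mRNA, so that every position of the two antiparallel strands can pair.

```python
# Toy illustration: an antisense RNA is the reverse complement of the mRNA.
# The sequence below is invented and is not a real growth factor sequence.

RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (written 5'->3') for an mRNA written 5'->3'."""
    return "".join(RNA_PAIR[base] for base in reversed(mrna.upper()))


def fully_paired(mrna: str, antisense: str) -> bool:
    """Check that the two strands form Watson-Crick pairs when antiparallel."""
    if len(mrna) != len(antisense):
        return False
    return all(
        RNA_PAIR[m] == a
        for m, a in zip(mrna.upper(), reversed(antisense.upper()))
    )


toy_mrna = "AUGGGAAAAAUCAGC"  # hypothetical 15-nucleotide stretch of an mRNA
toy_antisense = antisense_rna(toy_mrna)
print(toy_antisense)
print(fully_paired(toy_mrna, toy_antisense))
```

A sense transcript, by contrast, cannot pair with itself in this way, which is consistent with the use of control-transfected cells in the experiments described above.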

Antisense Oligonucleotides 

The sequence-specific effectiveness of chemically synthesized antisense 
oligodeoxynucleotides (Fig. 11.1B) relies on hybridization to an accessible
nucleotide sequence on the target mRNA, resistance to degradation by
cellular nucleases, and ready delivery into cells. Oligonucleotides with about
15 to 24 nucleotides have sufficient specificity to hybridize to a unique
mRNA. Potential mRNA target sites are determined by testing a set of
antisense oligonucleotides with cells in culture that produce the target mRNA.
Those antisense oligonucleotides that diminish the translation of the
specified protein are selected. Proteomic analysis of cellular proteins that are
labeled with fluorescent dyes during translation can be used to determine 
if the production of a particular protein is reduced in the presence of an 
antisense oligonucleotide. There are no general rules for predicting the best 
target sites in various RNA transcripts. Antisense oligonucleotides that are 
directed to the 5' and 3' ends of mRNAs, intron-exon boundaries, and 
regions that are naturally double stranded have all been effective. 
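
A rough calculation suggests why a length of about 15 nucleotides or more gives adequate specificity. The sketch below assumes, as a simplification, that the four bases occur independently and with equal frequency. The pool size of 10^8 nucleotides is a hypothetical figure used only for illustration and is not taken from the text.

```python
def expected_chance_matches(oligo_length: int, pool_nucleotides: float) -> float:
    """Expected number of perfect matches to one oligonucleotide by chance.

    Assumes a random sequence with equal base frequencies, so any given
    start position matches with probability (1/4) ** oligo_length.
    """
    start_positions = max(pool_nucleotides - oligo_length + 1, 0)
    return start_positions * 0.25 ** oligo_length


POOL = 1e8  # hypothetical total length of cellular RNA sequence, in nucleotides

for length in (10, 15, 20, 24):
    print(length, expected_chance_matches(length, POOL))
```

Under these assumptions, a 10-nucleotide sequence would be expected to occur by chance many times in the pool, whereas a sequence of 15 or more nucleotides would be expected to occur less than once. Real transcripts are not random sequences, so candidate oligonucleotides still have to be tested experimentally.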

Since oligodeoxynucleotides are susceptible to degradation by
intracellular nucleases, it was important to find ways to synthesize molecules that
are resistant to attack by nucleases without affecting the ability of the
antisense oligonucleotide to hybridize to a target sequence. To this end, the
backbone, pyrimidines, and sugar moiety have been modified (Fig. 11.3). 
Currently, the most extensively used antisense oligonucleotide has a sulfur 
group in place of the free oxygen of the phosphodiester bond (Fig. 11.3B). 
This modification is called a phosphorothioate linkage. Phosphorothioate 
antisense oligonucleotides are water soluble, polyanionic, and resistant to 
endogenous nucleases. In addition, when a phosphorothioate antisense
oligonucleotide hybridizes to its target site, the RNA-DNA duplex
activates the endogenous enzyme ribonuclease H (RNase H), which cleaves
the mRNA component of the hybrid molecule. Clinical trials with several
phosphorothioate antisense oligonucleotides, which are considered to be
"first-generation" therapeutic agents, have been initiated.
Second-generation antisense oligonucleotides typically contain alkyl modifications
at the 2' position of the ribose (Fig. 11.3E) and are generally less toxic and
more specific than phosphorothioate-modified molecules. Third-generation
antisense oligonucleotides contain a variety of modifications within the
ribose ring and/or the phosphate backbone and are less toxic than
either first- or second-generation antisense oligonucleotides. One
phosphorothioate antisense oligonucleotide has been approved by the U.S. Food
and Drug Administration (FDA) to treat cytomegalovirus infections of the
retina in patients with acquired immune deficiency syndrome (AIDS). This
particular antisense oligonucleotide, called fomivirsen and sold as Vitravene,
is administered by injection of 330 µg in a volume of 50 µl directly into an
affected eye after the application of a topical or local anesthetic. Fomivirsen
treatment is typically once every 2 weeks for 4 weeks, followed by once
The nematode worm
Caenorhabditis elegans is studied
as a model eukaryotic organism
in part because the strains are inexpensive
to breed and can be frozen.
When the cells are thawed, they 
remain viable, allowing long-term 
storage. C. elegans has the advantage 
of being a multicellular eukaryotic 
organism that is simple enough to be 
studied in great detail. The develop¬ 
mental fate of every single somatic cell 
(959 in the adult hermaphrodite; 1,031 
in the adult male) has been mapped 
out, and these patterns of cell lineage 
are largely invariant between individ¬ 
uals. In addition, C. elegans is one of 
the simplest organisms with a nervous 
system. 

In the late 1990s, Andrew Fire,
Craig Mello, and their colleagues were
investigating how gene expression is
regulated in C. elegans. When they
injected worms with mRNA molecules
encoding a C. elegans muscle protein,
they did not observe any changes in
the behavior of the worms. Injecting
the antisense version of this mRNA
also had no effect. However, when
they injected sense and antisense
RNAs together, they observed that the
worms displayed peculiar twitching
movements. Similar movements were
seen in worms that completely lacked
a functioning gene for the muscle protein.

Somehow the added double- 
stranded RNA molecule was silencing 
the expression of the gene carrying the 
same genetic information as that par¬ 
ticular RNA. When double-stranded 
RNA molecules containing portions of 
the mRNA sequences for several other 
worm proteins were injected, the 
expression of these genes was also 
silenced. 

From these experiments, Fire and
Mello deduced that double-stranded
RNA can silence genes, that this RNA
interference (RNAi) is specific for the gene whose
sequence matches that of the injected
RNA molecule, and that RNAi can
spread between cells and even be
inherited. In addition, since the injection
of even tiny amounts of double-stranded
RNA was sufficient to
achieve an effect, Fire and Mello proposed
that RNAi is a catalytic process.

Fire and Mello's discovery clarified 
many earlier confusing and appar¬ 
ently contradictory experimental 
observations and revealed a natural 
mechanism for controlling the flow of 
genetic information. Soon after this 
original report, other workers found 
interfering RNAs in a number of dif¬ 
ferent systems from worms to mam¬ 
mals to plants. This work opened up a 
whole new field of research. Workers 
soon discovered that RNAi can regu¬ 
late gene expression in hundreds of 
genes in our genome and that these 
small RNAs play an important role in 
animal and plant development and 
the control of cellular functions. RNAi 
also appeared to protect the genome 
against transposons and viruses, and 
it opened up exciting possibilities for 
use as a therapeutic agent. In 2006, 
Fire and Mello received the Nobel 
Prize in Physiology or Medicine for 
their pioneering work. 
FIGURE 11.3 Modifications to antisense oligonucleotides. (A) Phosphodiester linkage;
(B) phosphorothioate linkage; (C) phosphoramidate linkage; (D) polyamide linkage
(peptide nucleic acid); (E) 2'-O-methyl ribose; (F) C-5 propynylcytosine.


every 4 weeks. Before treatment with fomivirsen is started, it is essential 
that the presence of cytomegalovirus be absolutely confirmed, since several 
other infective agents produce similar symptoms. 

Antisense oligonucleotides with phosphoramidate and polyamide
(peptide) linkages have been synthesized in the expectation that these
molecules should be very resistant to nuclease degradation (Fig. 11.3C and D).
Furthermore, as mentioned above, chemical groups have been added to the 
2' carbon of the sugar moiety and the 5 carbon (C-5) of pyrimidines to both 
enhance stability and facilitate the binding of the antisense oligonucleotide 
to its target site (Fig. 11.3E and F). 

In one set of experiments, phosphoramidate antisense oligonucleotides
were delivered by injection into muscle, followed by a short (less than a second)
electrical pulse. In this case, the antisense oligonucleotide formed highly
stable duplexes with the target RNA and did not induce RNase H activity.
Since only oligonucleotides targeted to the 5' untranslated region, and not
to the coding portion, of the mRNA were inhibitory, it was surmised that
phosphoramidate antisense oligonucleotides prevented translation of the
target mRNA.

Several preclinical trials have shown the usefulness of antisense oligo¬ 
nucleotides as therapeutic agents. For example, the narrowing (stenosis) of 
coronary and carotid arteries that leads to heart attacks and strokes, respec¬ 
tively, is often alleviated by angioplasty, which is a procedure that widens 
arteries by the insertion of an inflated balloon. However, arterial blockage 
recurs (restenosis) in about 40% of patients within 6 months because angio¬ 
plasty induces a healing reaction, which stimulates the proliferation of 
smooth muscle cells and the secretion of an extracellular matrix in the inner 
layer of the artery at the site of the treatment. When phosphorothioate
antisense oligonucleotides that targeted mRNAs for proteins that are essential
for the mammalian cell cycle were applied to rat carotid arteries after 
angioplasty, restenosis was reduced by about 90%. In addition to postan¬ 
gioplasty restenosis, smooth muscle cell proliferation is implicated in ath¬ 
erosclerosis, hypertension, diabetes mellitus, and the failure of coronary 
bypass grafts. Presumably, these conditions might be controlled by similar 
antisense therapeutics. 

In another study, a 20-nucleotide phosphorothioate antisense oligonu¬ 
cleotide complementary to the coding region of human apolipoprotein B 
was used to lower the level of low-density lipoprotein cholesterol in 
humans. High levels of low-density lipoprotein cholesterol have long been 
considered a significant risk factor for cardiovascular disease; high levels of 
apolipoprotein B are also likely associated with cardiovascular risk. 
Apolipoprotein B, which is produced in the liver, is an essential structural 
and receptor-binding component of all atherogenic (plaque-causing) lipo¬ 
proteins. It plays a key role in low-density lipoprotein cholesterol transport 
and removal. Typically, statins, the most prescribed drug class in the world, 
are used to lower low-density lipoprotein cholesterol levels. While statins 
are effective for many individuals, some people continue to have high 
levels of both low-density lipoprotein cholesterol and apolipoprotein B. 
Therefore, a 20-nucleotide phosphorothioate antisense oligonucleotide 
complementary to the coding region of human apolipoprotein B was devel¬ 
oped to be an adjunct to statin treatment. While this antisense oligonucle¬ 
otide has been tested on only 36 individuals (Fig. 11.4), the results to date 


FIGURE 11.4 Dosing regimen of an antisense oligonucleotide designed to lower low- 
density lipoprotein cholesterol. The entire procedure took approximately 10 weeks. 
The arrows below the horizontal line indicate the times when the antisense
oligonucleotide was administered (subcutaneously). Doses of the antisense
oligonucleotide ranged from 50 to 400 mg per injection.
FIGURE 11.5 Schematic representation of a liposome carrying a nucleic acid.


are encouraging, and there is an expectation that this approach may lead to 
significant reductions in adverse cardiovascular events. 

Antisense oligonucleotides have been tested to determine if they can
control psoriasis, a disease of uncontrolled epidermal growth that causes
red, scaly, itchy patches to appear on the skin. Insulin-like growth factor I
has been implicated in the pathogenesis of psoriasis because insulin-like
growth factor I receptors are present in excess in psoriatic lesions. Thus,
antisense oligonucleotide lowering of the mRNA for insulin-like growth
factor I receptor might form the basis of a psoriasis therapy. In preliminary
experiments, all of the different insulin-like growth factor I receptor
antisense oligonucleotides tested were 15 nucleotides in length. The potential
antisense oligonucleotides were transfected into keratinocytes by using 
liposomes to facilitate antisense oligonucleotide cellular uptake (Fig. 11.5), 
and then the level of insulin-like growth factor I receptor mRNA was 
assessed. The three most active antisense oligonucleotides reduced the 
insulin-like growth factor I receptor protein concentration by 45 to 65%, 
while a random oligonucleotide had no effect on the amount of the protein. 
The selected antisense oligonucleotides were tested with athymic nude 
mice carrying human psoriatic lesional grafts. When the grafts were 
injected every 2 days for 20 days with antisense oligonucleotides comple¬ 
mentary to the insulin-like growth factor I receptor mRNA, there was a 
significant reduction (58 to 69%) in both epidermis thickness and the cross- 
sectional area of the skin lesions. This result is very encouraging and sug¬ 
gests that skin diseases in which a normal protein is overproduced may be 
appropriate targets for antisense oligonucleotides that can be delivered 
topically. 

Antisense oligonucleotides have also been used to inhibit the synthesis
of the transcription factor forkhead box O1 (also called FOXO1) in mice.
This protein increases the expression of phosphoenolpyruvate
carboxykinase (PEPCK) and glucose-6-phosphatase (G6Pase), both of which are key
enzymes in gluconeogenesis. Antisense oligonucleotides cause a reduction
in the expression of these genes (approximately 50 to 60%) in both liver and
fat tissues, but not in cardiac or skeletal muscle. Thus, the introduction of
antisense oligonucleotides complementary to FOXO1 mRNA essentially
mimics insulin action and therefore may bypass some of the defects in
insulin signaling common among diabetics. To choose the most effective
antisense oligonucleotide sequence, 80 different 20-nucleotide-long
sequences complementary to various portions of mouse FOXO1 mRNA
were tested, using mouse primary hepatocytes in culture, for the ability to
inhibit FOXO1 mRNA expression. The positive effects that were reported
represent the results of studies with the most effective of the original 80 
oligonucleotides. It will now be of interest to determine whether the prom¬ 
ising results observed in mice can be extended to humans. 
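
The text does not say how the 80 candidate sequences were chosen. One simple possibility, shown here purely as an assumption, is to take evenly spaced 20-nucleotide windows along the target mRNA and to synthesize the DNA reverse complement of each window. The mRNA in this Python sketch is a randomly generated placeholder, not a real transcript, and the function names are hypothetical.

```python
import random

# Complement of each mRNA base in a DNA oligonucleotide
DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(mrna_window: str) -> str:
    """DNA oligonucleotide (5'->3') complementary to an mRNA window (5'->3')."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(mrna_window))


def tile_candidates(mrna: str, oligo_length: int = 20, count: int = 80):
    """Return (start, oligo) pairs for evenly spaced windows along the mRNA."""
    last_start = len(mrna) - oligo_length
    if last_start < 0 or count < 1:
        return []
    if count == 1:
        starts = [0]
    else:
        # A set removes duplicate starts when the mRNA is short
        starts = sorted({round(i * last_start / (count - 1)) for i in range(count)})
    return [(s, antisense_dna(mrna[s:s + oligo_length])) for s in starts]


random.seed(0)  # fixed seed so that the placeholder sequence is reproducible
placeholder_mrna = "".join(random.choice("ACGU") for _ in range(2000))

candidates = tile_candidates(placeholder_mrna)
print(len(candidates))
print(candidates[0])
```

Each candidate would then have to be tested in cultured cells, since, as noted earlier, there are no general rules for predicting the best target sites.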

Aberrant splicing of an mRNA occurs when a mutation in an intron is 
recognized by the RNA-processing system as an authentic splice site, and 
consequently, a portion of the intron is included as part of the processed 
mRNA (Fig. 11.6A). The presence of part of an intron disrupts the reading 
frame, and a truncated protein is produced. As a result, a disease condition 
may result from a diminished level of normal protein. 
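
The effect of a retained intron segment on the reading frame can be illustrated with a toy example in Python. The exon and intron sequences below are invented for illustration only. The sketch reads codons in frame from the start codon and reports how many codons are read before the first stop codon is reached.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def codons_before_stop(mrna: str) -> int:
    """Count codons read in frame from the first AUG up to the first stop codon.

    Returns -1 if there is no AUG or no in-frame stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        return -1
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return (i - start) // 3
    return -1


# Hypothetical sequences, invented for this example
exon_1 = "AUGGCUGAACUG"        # four codons, beginning with the start codon
exon_2 = "GGUCAGCCGAAAGUUUAA"  # five codons followed by the stop codon UAA
intron_piece = "GUGUAAG"       # seven nucleotides retained by aberrant splicing

normal_mrna = exon_1 + exon_2
aberrant_mrna = exon_1 + intron_piece + exon_2

print(codons_before_stop(normal_mrna))    # length of the normal reading frame
print(codons_before_stop(aberrant_mrna))  # shorter: a stop lies in the intron piece
```

In this toy case the retained segment both introduces an early in-frame stop codon and, because its length is not a multiple of three, shifts the reading frame of everything downstream.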

It was reasoned that an antisense oligonucleotide that targeted an aber¬ 
rant splice site could likely prevent splicing at that site and increase the 
number of joining events between the correct intron-exon splice sequences. 
This notion was tested with a splice mutation in the second intron of the 
FIGURE 11.6 Correction of a mutant splice site with an antisense oligonucleotide. (A)
The outcome of a mutation in a splice site is depicted. The numbers denote exons.
The first intron is marked with the letter A, and the second intron contains a splice
mutation (red dot) that divides the intron into two parts (B1 and B2). The dotted
lines span the RNA segments that are removed during RNA processing. There are
two possible splicing events: pathway a leads to a functional mRNA, and pathway
b leads to an RNA that includes part of the second intron (B2). (B) An antisense
oligonucleotide (AS) (shown as a red bar) that binds to the mutant splice site prevents
RNA processing at the site, and consequently, only functional mRNA is produced.


β-globin gene (Fig. 11.6B). This mutation is responsible for one form of
β-thalassemia, which is an inherited blood disorder that leads to loss of red
blood cells (anemia). After cells that are homozygous for the intron 2 splice
site mutation were transfected with a 2'-O-methyl phosphorothioate
antisense oligonucleotide that targeted the mutant splice site, the number of
normal β-globin chains was increased by about 50%, which theoretically
would be beneficial to patients with this genetic defect. Further studies are 
required to determine if antisense rectification of splice site mutations is an 
effective therapeutic strategy for thalassemia and other conditions due to 
similar mutations. 

It has recently been shown that it is possible to protect mice against 
retroviruses by injecting them, intravenously or intraperitoneally, with 
phosphorothioated antisense oligonucleotides that prevent the conversion 
of the viral RNA genome into double-stranded DNA. In this system, the 
added antisense oligonucleotide effectively blocks replication of the
retrovirus. When an added antisense oligonucleotide binds to the junction of the
polypurine tract (which is present in a broad range of different
retroviruses) and the U3 element, a structure is formed which mimics the normal
substrate for the virus-encoded enzyme RNase H (Fig. 11.7). This causes
FIGURE 11.7 Schematic representation of the reverse transcription of retroviral RNA
(red) to produce double-stranded DNA (the minus strand is light blue, and the plus 
strand is dark blue). In step 1, a transfer RNA (tRNA) (shown as a blue circle) 
primes synthesis of the minus strand. The RNA shows U3 and U5 elements and a 
polypurine tract (PPT). In step 2, a complete RNA-DNA duplex is formed. In step 
3, RNase H digests the viral RNA in the RNA-DNA duplex into small pieces (the 
arrows indicate digestion sites). In step 4, a double-stranded DNA copy of the viral 
RNA is produced, which can exist as a provirus integrated into a cell's DNA. To 
block the formation of the minus strand and hence the synthesis of a double- 
stranded DNA version of the virus, an antisense oligonucleotide that hybridizes to 
the PPT region is added. 



premature cleavage of the viral RNA, resulting in the virus being destroyed 
before reverse transcription occurs. This strategy has been shown to be 
effective in protecting mice from retroviruses. In principle, it should also 
work in humans and on a range of different retroviruses. However, a 
number of technical obstacles must be overcome before this approach is 
ready for testing in humans. 


Ribozymes 

Ribozymes are naturally occurring catalytic RNA molecules (RNA
metalloenzymes) that are ~40 to 50 nucleotides in length and have separate
catalytic and substrate-binding domains. Compared with protein therapeutics,
an important advantage of ribozymes is that they are unlikely to evoke an 
immune response in a treated animal or human. The substrate-binding 
sequence combines by nucleotide complementarity and, possibly, non- 
hydrogen-bond interactions with its target sequence. The catalytic portion 
cleaves the target RNA at a specific site. By altering the substrate-binding 
domain, a ribozyme can be engineered to specifically cleave any mRNA 
sequence (Fig. 11.8). For therapeutic purposes, either hammerhead or 
hairpin ribozymes—named after the appearance of their secondary struc¬ 
ture that results from intrastrand base pairing—may be used. However, 
some workers have suggested that hammerhead ribozymes are preferable 
because of their ability to more efficiently recognize, bind to, and cleave a 
range of different mRNAs. 

In practice, an indirect strategy is often used for creating a therapeutic 
ribozyme, since the large-scale production of synthetic RNA molecules is 
FIGURE 11.8 Two-dimensional representation of hammerhead (A) and hairpin (B) 
ribozyme-mRNA substrate complexes. The mRNA substrates and ribozymes are 
shown in red and blue, respectively. A, adenosine; C, cytosine; G, guanosine; U, 
uridine; Y, a pyrimidine nucleotide (C or U); R, a purine nucleotide (A or G); H, any 
nucleotide except G; B, any nucleotide except A; V, any nucleotide except U; N and 
N', any complementary nucleotides. The arrows indicate the points of mRNA 
cleavage. 


difficult and RNA molecules are susceptible to degradation after delivery 
to a target cell. One approach to overcome these drawbacks entails chemi¬ 
cally synthesizing a double-stranded oligodeoxyribonucleotide with a 
ribozyme catalytic domain (~20 nucleotides) flanked by sequences that 
hybridize to the target mRNA after it is transcribed. The double-stranded 
form of the ribozyme oligodeoxyribonucleotide is cloned into a eukaryotic 
expression vector (usually a retrovirus). Cells are transfected with the con¬ 
struct, and the transcribed ribozyme cleaves the target mRNA, thereby 
suppressing the translation of the protein that is responsible for a disorder. 
Since most of the vectors that have been used cannot infect nondividing 
cells, target cells may be removed from a patient and then grown and trans¬ 
fected in culture before they are returned to the original tissue. 

As an alternative to intracellular ribozyme production, ribozymes may 
be delivered directly to cells by injection or with liposomes, i.e., exogenous 
delivery. Directly delivered ribozymes may be chemically modified to 
protect them from rapid breakdown by nucleases. For example, the 2' 
hydroxyl groups may be modified by alkylation or by substitution with 
either an amino group or a fluorine atom. These modifications increase the 
half-life of ribozymes in serum from minutes to days. 

Under laboratory conditions, ribozymes can inhibit the expression of a 
variety of viral genes and significantly inhibit the proliferation of numerous 
organisms. For example, in cell culture, ribozymes inhibit the expression of 
(1) human cytomegalovirus transcriptional regulatory proteins, resulting 
in a 150-fold decrease in viral growth; (2) human herpes simplex virus type 
1 transcriptional activator, resulting in a reduction of around 1,000-fold in 
viral growth; and (3) a reovirus mRNA encoding a protein required for 
viral proliferation. Moreover, a hammerhead ribozyme was designed to 
treat collagen-induced arthritis in mice. A plasmid-encoded ribozyme 
directed against the mRNA for tumor necrosis factor alpha, which is 
involved in rheumatoid arthritis, was intravenously injected into affected 
mice. Following this treatment, there was a significant reduction in the 
level of the tumor necrosis factor alpha mRNA, as well as a decrease in the 
collagen-induced arthritis. This type of ribozyme, used so far only with 
mice, has the potential to become a human therapeutic agent. 

The development of resistance in humans to various chemical treat¬ 
ments is a persistent problem for the pharmaceutical industry. Generally, a 
single mutation that alters the target site is sufficient to void the action of a 
drug. However, with appropriate ribozyme-based therapeutics, ribozymes 
for a number of different sites could be used simultaneously, thereby 
cleaving an mRNA at different sites. The ability to cleave multiple sites on 
a single viral gene should make it less likely that any single viral mutation 
will confer resistance. 
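
The reasoning behind targeting several sites at once can be made concrete with a rough calculation. The sketch below is only an illustration: it assumes that resistance mutations at different target sites arise independently and with the same probability, and the per-site probability used is a hypothetical value chosen for the example, not a measured one. The function name escape_probability is likewise hypothetical.

```python
def escape_probability(per_site_probability: float, sites_targeted: int) -> float:
    """Chance that every targeted site carries a resistance mutation at once.

    Assumption: mutations at different sites are independent and equally
    likely, so the individual probabilities simply multiply.
    """
    return per_site_probability ** sites_targeted


# Hypothetical per-site probability, used only to show the trend.
p = 1e-4
for n in (1, 2, 3):
    print(f"{n} site(s) targeted: escape probability ~ {escape_probability(p, n):.0e}")
```

Under these assumptions, each additional cleavage site lowers the chance of complete escape by the same factor, which is why a single mutation is unlikely to void a multisite ribozyme treatment.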

Deoxyribozymes 

To date, no naturally occurring DNA equivalent of ribozymes, i.e., DNA 
enzymes (deoxyribozymes), has been discovered. However, oligodeoxynu- 
cleotides with catalytic activity have been synthesized. The best character¬ 
ized and most studied of these deoxyribozymes is 10-23 RNase (Fig. 11.9). 
As a therapeutic agent, a catalytic oligodeoxynucleotide has some advan¬ 
tages over a ribozyme. DNA is approximately 1,000-fold more stable against 
hydrolytic destruction than protein and is nearly 100,000-fold more stable 
than RNA. In addition, deoxyribozymes are more efficient at binding and 
cutting mRNAs than are ribozymes. However, a deoxyribozyme cannot be 
produced continuously from a vector that has been introduced into a 
particular tissue, because transcription of a DNA sequence yields only 
RNA, i.e., ribozymes. Therefore, deoxyribozymes must be delivered directly to affected 
cells. Proof-of-principle experiments have shown that deoxyribozymes 
are effective with cells in culture. For example, deoxyribozymes have been 
used to cleave mRNA transcribed from the growth-stimulating gene myc, 
which limits the growth of leukemia cells in culture. A deoxyribozyme has 
also been used to prevent mRNA from a gene called Egr-1 from being 
expressed. Production of this mRNA is one reason for the failure of angio¬ 
plasty—a procedure in which a balloon at the end of a catheter is used to 

FIGURE 11.9 Two-dimensional representation of deoxyribozyme 10-23 RNase-mRNA 
substrate complex. The mRNA substrate and deoxyribozyme are shown in red and 
blue, respectively. A, adenosine; C, cytosine; G, guanosine; U, uridine; Y, a pyrimi¬ 
dine nucleotide (C or U); R, a purine nucleotide (A or G); N and N', any comple¬ 
mentary nucleotides. The arrow indicates the point of mRNA cleavage. 


[Figure 11.9 diagram: mRNA substrate 5'...NNNNNNN R Y NNNNNN...3' base paired with the complementary arms (N') of the deoxyribozyme.] 



unclog arteries that contain atherosclerotic plaques. When the Egr-1 mRNA 
is expressed, the recently unclogged artery is rapidly closed. 
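
The design logic implied by Fig. 11.9 can be sketched in a few lines of Python. The sketch is illustrative only. It assumes that cleavage occurs between a purine (R) and a pyrimidine (Y), that the purine is left unpaired, and that each binding arm is 7 nucleotides long. The 15-nucleotide catalytic core used here is the sequence commonly reported in the literature for the 10-23 deoxyribozyme and is not taken from this text; the function names and the test sequence are hypothetical.

```python
# Catalytic core commonly reported for the 10-23 deoxyribozyme (assumption).
CORE_10_23 = "GGCTAGCTACAACGA"

RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def dna_reverse_complement(rna: str) -> str:
    """Return the DNA strand (5'->3') that base pairs with an RNA segment."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def design_10_23(mrna: str, arm_len: int = 7) -> list[tuple[int, str]]:
    """List candidate deoxyribozymes for every R-Y junction in an mRNA.

    Each result is (number of mRNA nucleotides 5' of the cut, DNA sequence).
    The purine at the junction is left unpaired; the pyrimidine is the first
    nucleotide bound by the arm on the 5' side of the deoxyribozyme.
    """
    designs = []
    for i in range(arm_len, len(mrna) - arm_len):
        purine, pyrimidine = mrna[i], mrna[i + 1]
        if purine in "AG" and pyrimidine in "CU":
            upstream = mrna[i - arm_len:i]            # 5' of the purine
            downstream = mrna[i + 1:i + 1 + arm_len]  # starts at the pyrimidine
            sequence = (dna_reverse_complement(downstream)
                        + CORE_10_23
                        + dna_reverse_complement(upstream))
            designs.append((i + 1, sequence))
    return designs


# Hypothetical target with a single G-U junction that has full-length flanks.
for cut, dz in design_10_23("AAAAAAAGUAAAAAA"):
    print(cut, dz)
```

In practice, candidate sites would also be screened for accessibility within the folded mRNA, which this sketch does not attempt.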


Chimeric RNA-DNA Molecules 

The ability to convert a mutant base pair of a gene to the wild-type (normal, 
or correct) version would reverse the consequences of many different 
genetic conditions. A strategy using a modified RNA-DNA oligonucleotide 
with 68 nucleotides (chimeric oligonucleotide) has been devised for this 
purpose. The composition of the chimeric oligonucleotide includes a single 
mixed oligonucleotide with ribonucleotides and deoxyribonucleotides in a 
duplex conformation with hairpin caps at the ends of the complementary 
segments and methylation of the oxygen of the 2' carbon of the ribose 
sugars (Fig. 11.10). The rationale for this particular arrangement is based on 
various experimental observations. First, combined RNA-DNA molecules 
participate more readily than duplex DNA in homologous nucleic acid 
pairing reactions. Second, hairpin caps, which do not interfere with the 
pairing of homologous nucleic acid molecules, protect the molecule from 
exonucleases. Third, the 2'-O-methylation of the ribose units shields the 
molecule from degradation by RNase H. In addition, the organization of 
the nucleotides of a chimeric oligonucleotide is important. Ten ribonucle¬ 
otides flank a central core of five deoxyribonucleotides, and except for the 
correct base pair, this segment of the chimeric oligonucleotide has the same 
sequence as the target. 

In cell culture, the feasibility of base pair correction with a chimeric 
oligonucleotide was examined with both a mutated cDNA sequence car¬ 
ried by a plasmid and a mutant site within a chromosomal sequence. In 
both instances, with high frequencies, the mutant sites were replaced by the 
correct base pair. However, more studies are required before chimeric 
oligonucleotides will become effective therapeutic agents. 


Aptamers 

Aptamers are nucleic acid sequences, RNA or DNA, that bind tightly to 
proteins, amino acids, drugs, or other molecules. They are typically 15 to 40 


FIGURE 11.10 Correction of a single-base-pair mutation by a chimeric oligonucle¬ 
otide. The double arrow points to the mutant site in the target sequence and the 
correct base pair in the chimeric oligonucleotide. The mutant and correct nucle¬ 
otides are underlined. The uppercase letters represent deoxyribonucleotides, and 
the lowercase letters represent ribonucleotides. The nucleotides of the hairpin caps 
are shown in red. The vertical line and the 3' and 5' designations mark the 3' and 5' 
ends of the chimeric oligonucleotide. Adapted from Yoon et al., Proc. Natl. Acad. Sci. 
USA 93:2071-2076, 1996. 

Target sequence 

- ACCCCCAGCGCCGCCTACACCCACTCGGCTGACCGG - 

- TGGGGGTCGCGGCGGATGTGGGTGAGCCGACTGGCC - 


p TGCGCGu cgcggcgga TGCGGg ugagccgacT p 

T tcgcgc|agcgccgcctacgcccactcggctgt t 

3 ' 5' Chimeric oligonucleotide 
FIGURE 11.11 Overview of the SELEX procedure for selecting aptamers with high 
affinity to a target molecule (often a protein). The selected aptamers are typically 
cycled through this procedure 5 to 15 times. 


nucleotides long, have highly organized secondary and tertiary structures, 
and bind with high affinity ($10^{-12}$ M < $K_d$ < $10^{-9}$ M, where $K_d$ is the dissociation 
constant) to their target molecules. Aptamers are attractive as potential 
therapeutic agents because of their high specificity, relative ease of produc¬ 
tion, low or no immunogenicity, and long-term stability. 
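
To give a feel for what a dissociation constant in the picomolar-to-nanomolar range means, the following sketch computes the fraction of a target that is occupied at a given free aptamer concentration. It assumes simple one-to-one equilibrium binding with the aptamer in excess over the target; the function name fraction_bound and the concentrations are hypothetical choices for illustration.

```python
def fraction_bound(aptamer_nM: float, kd_nM: float) -> float:
    """Fraction of target occupied at equilibrium for simple 1:1 binding.

    Assumes the free aptamer concentration is not depleted by binding.
    """
    return aptamer_nM / (aptamer_nM + kd_nM)


# Compare a picomolar-range and a nanomolar-range Kd at 1 nM free aptamer.
for kd in (0.001, 1.0):
    print(f"Kd = {kd} nM: fraction of target bound = {fraction_bound(1.0, kd):.3f}")
```

When the aptamer concentration equals the dissociation constant, half of the target is bound; the lower the dissociation constant, the less aptamer is needed to occupy most of the target.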

Aptamers that are directed against specific targets are typically selected 
by a procedure known as SELEX (systematic evolution of ligands by 
exponential enrichment), in which DNA or RNA ligands that bind to the target 
molecule are selectively enriched (Fig. 11.11). In this procedure, a random 
DNA sequence is cloned between two particular DNA sequences. The 3' 
region contains an attachment site for reverse transcriptase primers, and 
the 5' region contains an attachment site for a polymerase chain reaction 
(PCR) primer. The double-stranded DNA is converted to RNA using T7 
RNA polymerase. The SELEX procedure combines several rounds of 
binding, partitioning, and amplification of selected nucleotide sequences 
from an initial pool of up to $10^{16}$ nucleotide sequence variants. The end 
TABLE 11.1 Some proteins against which aptamers have been 
generated and the affinities of the aptamers for the proteins 

Protein                                      Kd (nM) 
Keratinocyte growth factor                   0.0003 
HIV type 1 reverse transcriptase             0.02 
Transforming growth factor β1                0.03 
P-Selectin                                   0.04 
VEGF receptor                                0.05 
Platelet-derived growth factor               0.09 
Immunoglobulin E                             0.1 
Extracellular signal-regulated kinase        0.2 
CD4 antigen                                  0.5 
HIV type 1 RNase H                           0.5 
Factor IXa                                   0.58 
Angiogenin                                   0.7 
Complement factor 5                          1.0 
Transforming growth factor β2                1.0 
Secretory phospholipase A2                   2.0 
Thrombin                                     2.0 
Angiopoietin 2                               2.2 
γ-Interferon                                 2.7 
L-Selectin                                   3.0 
Human neutrophil elastase                    5.0 
Tenascin C                                   5.0 
Integrin                                     8.0 
Hepatitis C virus NS3 protease               10.0 
Factor VIIa                                  11.0 
Yersinia pestis tyrosine phosphatase         18.0 
Anti-insulin receptor antibody MA20          30.0 
Trypanosoma cruzi cell adhesion receptor     172.0 


result of this procedure is the selection of aptamers that bind to the target 
molecule with high affinity. Ultimately, the SELEX procedure yields one (or 
just a few) unique nucleic acid sequence(s) from the original mixture with 
high affinity for the target molecule. To make aptamers less sensitive to 
nuclease digestion, OH residues at the 2' positions of purines may be 
replaced with 2'-O-methyl residues. In addition, aptamers may be capped 
at their 3' end with a deoxythymidine residue. Table 11.1 lists some of the 
proteins against which aptamers have been generated, as well as the range 
of affinities of the aptamer for the target protein. 
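
The way repeated rounds of binding, partitioning, and amplification enrich rare high-affinity sequences can be illustrated with a toy calculation. The sketch below is not a model of any particular SELEX experiment: the starting fraction of binders and the two retention probabilities are hypothetical, and the pool is treated as containing only two classes of sequence. The function name selex_rounds is also hypothetical.

```python
def selex_rounds(initial_fraction: float,
                 retain_binder: float,
                 retain_nonbinder: float,
                 rounds: int) -> list[float]:
    """Track the fraction of high-affinity sequences over selection rounds.

    In each round, binders and nonbinders survive partitioning with the
    given probabilities; amplification then restores the pool size without
    changing the proportions.
    """
    fraction = initial_fraction
    history = [fraction]
    for _ in range(rounds):
        kept_binders = fraction * retain_binder
        kept_others = (1.0 - fraction) * retain_nonbinder
        fraction = kept_binders / (kept_binders + kept_others)
        history.append(fraction)
    return history


# Hypothetical values: binders start as 1 in 10^14 sequences and are retained
# 100 times more efficiently than nonbinders in every round.
for round_number, f in enumerate(selex_rounds(1e-14, 0.5, 0.005, 8)):
    print(round_number, f"{f:.3g}")
```

With these illustrative numbers, the binders remain a negligible part of the pool for several rounds and then come to dominate it within one or two further rounds, which is consistent with the need to repeat the cycle several times.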

An aptamer known as pegaptanib received approval from the U.S. FDA 
in December of 2004 for use as a human therapeutic agent. Pegaptanib is a 
30-nucleotide-long aptamer that targets vascular endothelial growth factor 
(VEGF) and binds to the protein with extremely high affinity ($K_d$ = 0.05 nM). 
This secreted protein promotes the growth of new blood vessels by stimu¬ 
lating the endothelial cells that not only form the walls of the blood vessels, 
but also transport nutrients and oxygen to the tissues. When retinal pigment 
epithelial cells begin to senesce from lack of nutrition (ischemia), VEGF acts 
to stimulate the synthesis of new blood vessels (neovascularization). 


FIGURE 11.12 Schematic representation 
of protein VEGF, which contains both a 
receptor-binding domain and a hep¬ 
arin-binding domain. 


However, this process is imperfect, and often the blood vessels do not form 
properly so that leakage results, causing scarring in the macular region of 
the retina with the eventual loss of central vision. These physiological 
changes contribute to age-related macular degeneration, a leading cause of 
blindness. Pegaptanib was selected to bind to one of the four isoforms of 
VEGF (VEGF165) that is responsible for age-related macular degeneration. 
The drug is injected directly into the eye every 6 weeks, or about nine times 
a year. All four forms of VEGF have a receptor-binding domain (Fig. 11.12), 
while only VEGF165 has a heparin-binding domain, which is the specific 
target for pegaptanib. Pegaptanib is a new type of therapeutic agent with an 
unusual specificity that can effectively suppress age-related macular 
degeneration. To date, approximately 95% of the patients receiving pegaptanib 
have been 65 years of age or older. 

The safety of aptamers used in clinical trials is a concern, especially 
when the optimal dose of a particular aptamer is not known. One way to 
overcome this problem is through the use of aptamer "antidotes." These 
molecules consist of short oligonucleotides whose sequences are comple¬ 
mentary to the aptamers being tested. When antidotes are added to 
aptamers, they hybridize to the aptamers and inhibit their binding to the 
clinical target. 
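
Because an antidote is defined by complementarity, its sequence follows directly from that of the aptamer. The sketch below simply computes the reverse complement of an aptamer, or of a chosen part of it. The aptamer sequence shown is hypothetical, as is the function name antidote_for; a real antidote would also be chosen with the folded structure of the aptamer in mind.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antidote_for(aptamer: str, is_rna: bool = True) -> str:
    """Return the oligonucleotide (5'->3') complementary to an aptamer segment."""
    table = RNA_COMPLEMENT if is_rna else DNA_COMPLEMENT
    return aptamer.upper().translate(table)[::-1]


# Hypothetical RNA aptamer; the antidote here covers only its first 12 nucleotides.
aptamer = "GGGAUCCAGUCGAUCGGAUCCC"
print(antidote_for(aptamer[:12]))
```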


Interfering RNAs 

Principles 

The addition of double-stranded RNA to animal and plant cells reduces the 
expression of the gene from which the double-stranded RNA sequence is 
derived. This "gene silencing," which specifically reduces the concentra¬ 
tion of a target mRNA by up to 90%, is reversible, since there is no change 
in the target cells' DNA. This phenomenon has been termed RNA interfer¬ 
ence (or "RNAi") and occurs naturally in virtually all eukaryotic organ¬ 
isms. RNAi appears to be the same phenomenon as gene silencing in 
animals or cosuppression in plants. Although all of its biological roles 
remain to be established, RNAi may protect both animals and plants from 
viruses and from the accumulation of transposons. A working model for 
RNAi has been formulated based on experimental analyses (Fig. 11.13). 
Following the introduction of a double-stranded RNA molecule into a cell, 
the double-stranded RNA is cleaved by the RNase III-like enzyme Dicer 
into short double-stranded pieces of RNA, approximately 21 to 23 nucleotides in 
length, that have been called small interfering RNAs (siRNAs). The anti- 
sense strand of an siRNA is incorporated into an RNA-induced silencing 
complex (RISC) that binds to and then cleaves the mRNA. The specific 
binding of the siRNA to the mRNA that occurs is based on the complemen¬ 
tarity of the two RNA sequences. The site of cleavage of the targeted mRNA 
is between nucleotides 10 and 11 relative to the 5' end of the siRNA (anti- 
sense) guide strand. Consistent with this model, the transfection of mam¬ 
malian cells in culture with duplexes of 21-nucleotide RNA can also 
mediate RNAi. 
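
The positional rule for cleavage can be worked through with a short sketch. It assumes, for simplicity, that the guide (antisense) strand is fully complementary to a single site in the mRNA; the sequences and the function names are hypothetical and serve only to show how the cut site is counted from the 5' end of the guide strand.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the RNA strand (5'->3') that base pairs with the input."""
    return rna.translate(COMPLEMENT)[::-1]


def risc_cleavage(mrna: str, guide: str):
    """Split an mRNA where a fully complementary guide strand directs cleavage.

    The cut falls between the mRNA nucleotides paired with guide positions
    10 and 11, counted from the 5' end of the guide. Returns the 5' and 3'
    fragments, or None if the mRNA has no fully complementary site.
    """
    if len(guide) < 11:
        raise ValueError("guide strand must be at least 11 nucleotides long")
    start = mrna.find(reverse_complement(guide))
    if start == -1:
        return None
    # Guide position 1 pairs with the last nucleotide of the target site,
    # so guide position 10 pairs with mRNA index start + len(guide) - 10.
    cut = start + len(guide) - 10
    return mrna[:cut], mrna[cut:]


# Hypothetical 21-nucleotide guide and a short mRNA containing its target site.
guide = "G" * 10 + "U" * 11
mrna = "GG" + "A" * 11 + "C" * 10 + "UU"
print(risc_cleavage(mrna, guide))
```

In this example the ten G residues at the 5' end of the guide pair with the ten C residues of the mRNA, so the cut falls between the last A and the first C.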

Despite the fact that many aspects of RNAi are still not completely 
understood, it could form the basis for new therapeutic agents. The use of 
short RNA duplexes may eventually provide an alternative approach to the 
use of antisense oligonucleotides or ribozymes. 
The phenomenon of RNAi is expected to facilitate the development of a 
wide range of antiviral compounds and therapies utilizing specific siRNAs 
delivered to the appropriate target cell. Similarly, the expression of endog¬ 
enous eukaryotic genes may be inhibited by plasmid-driven expression of 
short hairpin RNAs (shRNAs), which are similar in structure to the micro- 
RNAs that often normally regulate gene expression in eukaryotic cells. In 
fact, there are a number of reports of the use of either siRNA or shRNA to 
suppress virus replication in tissue culture (Table 11.2). Moreover, it has 
been successfully demonstrated that RNAi is effective in vivo (with mice), 
suggesting that, in principle, all viruses may be inactivated by RNAi. 

Independently of how an siRNA or shRNA is introduced into a cell, it 
may have nonspecific effects. For example, introduction of these molecules 
may inadvertently activate innate cellular immune responses, such as the 
interferon response. In addition, siRNA or shRNA may also be complemen¬ 
tary to nontarget mRNAs. However, several experimental approaches may 
be utilized to avoid these problems. (1) Off-target effects are most often 
observed when the siRNA or shRNA concentration is >100 nM. By lowering 
the concentration as much as possible (often to 20 nM or less), off-target 

FIGURE 11.13 Overview of the process of RNA interference. Following introduction of 
double-stranded RNA (dsRNA) into a cell, the Dicer complex binds to the RNA and 
cleaves it into an siRNA containing approximately 21 bp. The antisense strand (red) 
becomes part of the RISC, directing the cleavage of the complementary mRNA. 
TABLE 11.2 Some examples of suppression of viral replication in tissue 
culture by RNAi 

Severe acute respiratory syndrome-associated coronavirus 
Hepatitis C virus 
West Nile virus 
Coxsackievirus B3 
Foot-and-mouth disease virus 
Hepatitis A virus 
Human rhinovirus 6 
Poliovirus 
Respiratory syncytial virus 
Human parainfluenza virus 3 
Vesicular stomatitis virus 
Influenza virus 
Hepatitis delta virus 
Rotavirus 
HIV type 1 
Hepatitis B virus 
Herpes simplex virus type 1 
Human cytomegalovirus 
Epstein-Barr virus 
Human herpesvirus 6B 
Murine herpesvirus 68 
Human papillomavirus type 18 
JC virus 


effects are often avoided. (2) Since the interferon response can be induced by 
double-stranded RNAs with as few as 11 bp that are perfectly complemen¬ 
tary, siRNA or shRNA is designed to contain at least a 1-nucleotide bulge 
(where the bases on opposing strands are noncomplementary) near the 
center of the molecule (typically around 21 bp). (3) Since siRNA or shRNA 
can exert a toxic effect when it contains the sequence 5'-UGGC-3', this 
sequence should be avoided. (4) Blunt-ended 27-bp RNA duplexes or 29-bp 
shRNAs with 2 nucleotides overhanging at the 3' end are much more potent 
inducers of RNAi than 21-mer siRNAs. The greater level of effectiveness of 
the slightly longer RNAs may reflect the fact that they are first bound and 
cleaved by Dicer, which facilitates their entry into the RISC. Using these 
slightly longer RNAs at low concentrations should avoid side reactions 
associated with 21-mer siRNAs. 
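
One of these precautions, avoiding the 5'-UGGC-3' sequence, lends itself to a simple computational screen. The sketch below enumerates candidate siRNAs along a target mRNA and discards any in which either strand contains that sequence. It is a minimal illustration, not a complete design algorithm: it ignores concentration, the interferon response, and duplex length, and the function names and test sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")
TOXIC_MOTIF = "UGGC"  # sequence that the text says should be avoided


def reverse_complement(rna: str) -> str:
    """Return the RNA strand (5'->3') that base pairs with the input."""
    return rna.translate(COMPLEMENT)[::-1]


def candidate_sirnas(mrna: str, length: int = 21) -> list[tuple[int, str, str]]:
    """Return (start, sense strand, antisense strand) for acceptable windows.

    A window is rejected if the motif occurs in either strand of the duplex.
    """
    candidates = []
    for start in range(len(mrna) - length + 1):
        sense = mrna[start:start + length]
        antisense = reverse_complement(sense)
        if TOXIC_MOTIF in sense or TOXIC_MOTIF in antisense:
            continue
        candidates.append((start, sense, antisense))
    return candidates


# Hypothetical target: only the last window contains the motif and is dropped.
for start, sense, antisense in candidate_sirnas("A" * 21 + "UGGC"):
    print(start, sense, antisense)
```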

Unlike siRNAs, shRNAs are expressed in vivo as part of a genetic con¬ 
struct that includes a promoter sequence. This means that shRNAs need to 
be introduced by using strategies different from those used with siRNAs. 
Thus, shRNAs are typically delivered to their target cells by using viral vec¬ 
tors. Viral vectors that integrate into the chromosomal DNA are generally 
used when persistent long-term knockdown of gene expression is desired; 
the most popular choice is lentiviruses. However, with the use of all virus- 
based vectors, there are serious safety concerns that need to be addressed, 
and the type of promoter that is used also needs to be optimized. 

Applications 

Interfering RNAs have already found widespread use as tools in research 
that is directed toward understanding how gene expression is regulated in 
natural systems. One company that specializes in producing RNAi directed 
against human mRNAs advertises, "For each target (human) gene, we pro¬ 
vide four plasmids each with a different short hairpin RNAi sequence 
(shRNA). Our experimentally verified design algorithm minimizes the risk 
of off target effects and ensures the maximum knock-down. At least one of 
the four shRNA plasmids will reduce the target mRNA levels in the trans¬ 
fected cells by >70%." With the ready availability of human shRNA 
libraries, new insights and understanding of many fundamental and dis¬ 
ease processes should be rapidly forthcoming. It is hoped that this in turn 
will lead to a variety of new therapeutic agents and approaches. 

In addition to the many reports of successful modification of the gene 
expression of cells in culture with RNAi, there are an increasing number of 
reports of the in vivo effectiveness of RNAi. By 2008, siRNAs against a 
wide range of proteins, viruses, and diseases had been successfully 
expressed in mice, including siRNAs directed against herpes simplex virus 
type 2, hepatitis B virus, hepatitis C virus, Huntington disease, metastatic 
Ewing sarcoma (a form of cancer), respiratory syncytial virus, hepatic 
cancer, transforming growth factor receptor 2, severe acute respiratory syn¬ 
drome, heme oxygenase 1, keratinocyte-derived chemokine, tumor necrosis 
factor alpha, and human epidermal growth factor receptor 2. In addition, 
in 2008, three different RNAi therapeutics were being tested in clinical 
trials. All three of these therapeutics target VEGF or its receptor, a cause of 
age-related macular degeneration (see "Aptamers" above), and at that 
time, all had successfully completed either phase I or phase II clinical 
trials. 
FIGURE 11.14 (A) DNA construct of a single-chain antibody against the hemorrhagic 
septicemia virus G protein. PCMV, cytomegalovirus promoter; TGF, rainbow trout 
transforming growth factor beta signal peptide; VH, variable domain from mouse 
hybridoma heavy chain; spacer, 42 nucleotides encoding a 14-amino-acid spacer 
connecting VH and VL; VL, variable domain from mouse hybridoma light chain; CL, 
constant part of the human light chain gene. (B) Schematic representation of the 
single-chain antibody against the hemorrhagic septicemia virus G protein. The 
curved black line represents the 14-amino-acid spacer. 


Antibody Genes 

A major problem with aquaculture is the loss of fish due to bacterial and 
viral infections. The chemical treatments that protect fish against these dis¬ 
ease-causing pathogens are extremely costly and not particularly effective. 
One way to address this problem is to develop fish that synthesize protec¬ 
tive antibodies against particular pathogens. The impetus for this approach 
came from the observation that rainbow trout could be protected against 
hemorrhagic septicemia virus by passive immunization through the injec¬ 
tion of a monoclonal antibody against the G protein from the virus. 
Subsequently, a gene encoding a single-chain antibody directed against the 
hemorrhagic septicemia virus G protein was synthesized. The synthetic 
gene encoded variable regions from the mouse L and H chains of an anti-G 
protein monoclonal antibody with a human antibody constant domain 
fused to the 3' end of the construct (Fig. 11.14). To mediate secretion in fish 
cells, the DNA was fused to the gene sequence encoding the signal sequence 
of rainbow trout transforming growth factor beta. The DNA construct was 
inserted into a eukaryotic expression vector under the transcriptional con¬ 
trol of a constitutive cytomegalovirus promoter. The cloned gene-vector 
construct was injected into the circulatory system of rainbow trout. When 
the fish were challenged with the hemorrhagic septicemia virus 11 days 
after injection of the DNA construct, nearly all of them survived, whereas 
FIGURE 11.15 Survival of fish exposed to hemorrhagic septicemia virus 11 days after 
being injected with DNA encoding an antibody directed against the hemorrhagic 
septicemia virus G protein (+DNA) compared with control fish that did not receive 
this DNA construct (-DNA). 


the control fish did not (Fig. 11.15). Thirty-nine days after the DNA injection, 
the plasma concentration of the single-chain antibody remained at a high 
level. Thus, instead of administering a purified antibody to provide an 
animal with passive immunity, it is possible to inject an animal with DNA 
encoding the antibody. Giving an animal the ability to synthesize a specific 
antibody may be more efficient than depending upon the animal's immune 
system to produce a similar antibody. However, since it may not be practical 
to inject large numbers of fish, the next step will be to create transgenic fish 
with antibody genes that confer passive immunity to various diseases. 


Nucleic Acid Delivery 

The ultimate effectiveness of any therapeutic agent depends upon the 
ability to deliver that agent to the tissues where it is required. Systemic 
introduction of a therapeutic agent often leads to the accumulation of very 
high levels in tissues where the agent is not required and sometimes results 
in serious side effects. To avoid this, viral vectors that deliver small nucleic 
acids to specific cellular targets have been developed. However, although 
virus-based gene delivery has been successful, a number of safety concerns 
have arisen in regard to the use of these vectors. Several approaches have 
recently been developed as an alternative to virus-based systems, as well 
as to the systemic introduction of target nucleic acids. There are several 
methods that have been used to deliver relatively small nucleic acids to 
animal cells. They include (1) intravenous injection; (2) local injection at the 
site of the pathology; (3) packaging into cationic liposomes (Fig. 11.5); (4) 
physical methods, like electroporation, sonoporation, or hydrodynamic 
pressure; and (5) a number of systems in which the nucleic acid is chemi¬ 
cally conjugated to another molecule. 

Human Gene Therapy 

Most of the experience of managing genetic diseases has been gathered 
from inherited inborn errors of metabolism and, to a lesser extent, from 
disorders with defective structural proteins. The strategies for treating 
genetic disorders include restrictive or supplemented diets, inhibition of 
enzyme reactions to prevent the accumulation of toxic molecules, removal 
of toxic molecules, replacement of defective or absent proteins, restoration 
of protein activity, selective protein removal, organ and bone marrow 
transplantation, and nucleic acid-based therapies (Table 11.3). 

With the development of recombinant DNA technology, large num¬ 
bers of enzymes and structural proteins are now available for therapeutic 
use. Infusions of β-glucosidase, α-galactosidase, α-L-iduronidase, and 
adenosine deaminase (ADA) in clinical trials have significantly reduced 
the adverse effects of Gaucher disease, Fabry disease, mucopolysacchari¬ 
dosis I, and severe combined immunodeficiency disorder (SCID), respec¬ 
tively. This type of treatment, called enzyme replacement therapy, works 
well when either the enzyme or the structural protein (protein replace¬ 
ment therapy) is delivered to its biological site of action through the 
bloodstream. 

Since the 1940s, when it was discovered that a gene from one strain of 
bacteria could be transferred to and expressed in another strain, researchers 
have contemplated the possibility that human genetic diseases might be 
cured in an analogous manner. Introduction of a normal gene into a cell 
with a defective gene ought to correct the disorder because the transferred 


TABLE 11.3 Strategies for treating genetic disorders 

Specially formulated diets 
    Restrictive diet to lower the intracellular level of a toxic molecule 
    Supplemented diet to replace a metabolic deficiency 
Inhibition of enzyme reactions 
    Enzyme inhibitor to prevent the accumulation of a toxic molecule by 
    blocking a step in a metabolic pathway that precedes the reaction 
    with a defective enzyme 
Removal of toxic molecules 
    Dialysis 
    Removal of excess cations (chelation) 
    Facilitation of excretion by binding a toxic molecule to a 
    low-molecular-weight compound 
Replacement of defective or missing product 
    Enzyme replacement therapy 
    Protein replacement therapy 
    Cofactor supplementation 
Alteration of defective protein by small molecules 
    Restoration of partial protein function 
    Directed proteolytic degradation of defective protein 
Transplantation 
    Replacement of a nonfunctional organ (organ transplantation) 
    Providing a required protein synthesized by blood cells 
    (bone marrow transplantation) 
Gene therapy 
    Rectification of a genetic defect with a functional gene 
Nucleic acid therapy 
    Blocking translation of mRNA from a mutant gene with an 
    oligonucleotide (antisense, ribozyme) 
    Correction of a gene mutation with an oligonucleotide 
gene provides the required gene product. In theory, gene therapy should 
provide a persistent in vivo treatment, either in the tissue that is primarily 
affected by a gene mutation or in deficient cells that acquire a recombinant 
protein from distant cells that release the therapeutic protein into the circu¬ 
latory system. Although gene therapy was intended as a cure for genetic 
disorders, gene products can also be used to treat cancers, infections, and 
various degenerative disorders. 

In 1990, after exhaustive reviews by many different regulating panels 
in the United States, the first human gene therapy trial was initiated. Two 
young girls with ADA-deficient SCID received large doses of their own 
cells that had been engineered to carry a functional ADA gene. In both 
instances, the adverse symptoms were alleviated, indicating that this form 
of therapy is feasible. One of the patients has been free of SCID for more 
than 10 years, although she was regularly administered polyethylene 
glycol-ADA. The second ADA-deficient patient from the original trial and 
others from additional experiments have not had long-lasting cures. After 
the initial trials with gene therapy for ADA, a number of gene-based clin¬ 
ical protocols for various conditions were conducted. Unfortunately, these 
trials failed to establish the effectiveness of any of the treatments. 
Notwithstanding this lack of success, the trials provided a great deal of 
information about the methods of gene delivery, the duration of gene 
expression, and other technical features of gene therapy. Generally, despite 
the failure to correct a genetic disorder with an exogenous functional gene, 
this type of clinical research was considered safe. However, in September 
1999, the attitude toward gene therapy changed dramatically. Jesse
Gelsinger, a relatively healthy 18-year-old with ornithine transcarbamylase (OTC)
deficiency, was given a large dose of a virus carrying the OTC gene as part of a
clinical trial. Tragically, he died 4 days later of a massive immune response. 
In another trial, injection of a gene into the heart muscle of a patient with 
severe coronary artery disease was fatal. As a consequence, although these 
disastrous outcomes were not predictable, the rigorous requirements for 
human gene therapy experiments became even more stringent. Researchers, 
in addition, became disinclined to initiate new trials. However, in 2000, two 
infants with an X-linked form of SCID (SCID-X1) were successfully treated 
with the gene encoding the subunit (γc) that is part of various cytokine
receptors. These patients were free of symptoms for 10 months and are
being monitored to determine if the correction is permanent. However, of the 11
patients eventually treated in this way, 4 developed cancer. Also, hemophilic patients expressed an input
gene encoding the blood coagulation factor IX for long periods and, impor¬ 
tantly, enough of the protein was produced to reduce the extent of the 
condition from severe to mild. 

Although in the broadest sense, the concept of human somatic cell gene 
therapy is straightforward, there are a number of critical biological consid¬ 
erations. For example, how will the cells that are to be targeted for correc¬ 
tion be accessed? How will the therapeutic (remedial) gene be delivered? 
What proportion of the target cells must acquire the input gene to coun¬ 
teract the disease? Does transcription of the input gene need to be precisely 
regulated to be effective? Will overexpression of the input gene cause alter¬ 
native physiological problems? Will the cells with the input gene be main¬ 
tained indefinitely, or will repeated treatments be required? 

Much of the research effort in human gene therapy has been directed 
to developing efficient and nonimmunogenic systems that deliver a thera¬ 
peutic gene to a specified cell type. Both viral and nonviral strategies have 
been examined in detail. Different types of viruses, including retroviruses,
adenovirus, adeno-associated virus, herpes simplex virus, and vaccinia 
virus, have been engineered as gene transfer vectors. A virus-mediated 
gene delivery system uses the cell receptor recognition system of the virus 
for binding to a specific cell type, which is followed by the internalization 
of the vector DNA and its transport to the cell nucleus. Some viruses have 
mechanisms for integration of the vector DNA into a chromosome site, and 
with others, the input DNA is maintained as an extrachromosomal ele¬ 
ment. Since high efficiency of vector DNA transfer is important, the admin¬ 
istered sample of viruses should contain mostly vector viruses with very 
few, and preferably no, infectious virus particles. To meet this goal, pack¬ 
aging cell lines were constructed for some virus vectors. These cultured 
cells carry genes that express viral proteins that are necessary for the for¬ 
mation of virus particles but are not capable of producing replication- 
competent (infectious) viruses. After transfection of a packaging cell line 
with a vector construct that is equivalent to the length of the wild-type 
virus genome and that carries the appropriate DNA sequence for pack¬ 
aging into a virus particle, the input DNA is replicated, assembled into 
viruses, and released into the cell medium. The noninfectious vector 
viruses are concentrated and prepared for use. Packaging cell lines have 
been devised for retrovirus vectors (Fig. 11.16) and other viral delivery 
systems. In some cases, disarmed viral and vector DNAs are cotransfected 
and only viruses with vector DNA are produced. In other instances, proto¬ 
cols allow the formation of both vector and infectious viruses with the 
separation of the two types of viruses before use. There are advantages and 
disadvantages to each of the major viral gene delivery systems. Some 
vector viruses transduce DNA at high efficiencies but the size of the insert 
is limited. Retrovirus vectors infect only dividing cells, which, without
genetic modification, makes them unsuitable for treating disorders of
nondividing cells. Some viruses lack cell specificity. However, this particular
shortcoming has been countered by designing viruses with cell-specific
receptor-binding sites. Many vector viruses are immunogenic, which rules out
repeated treatment with the same viral strain. To overcome this
problem, vector viruses with different antigenic determinants may be used 
for successive treatments. Currently, third- and fourth-generation varieties 
of vector viruses are being developed with distinctive features for specific 
illnesses. 

Some successful gene therapy experiments using viral vectors have 
been reported. In phase I clinical trials, a small number of patients have been 
treated with a recombinant adeno-associated virus vector carrying a human 
cDNA encoding retinal-pigment epithelium-specific 65-kDa protein (RPE65). 
The protein encoded by the RPE65 gene is an important part of the visual 
cycle, forming part of a pathway that regenerates the visual pigment after 
exposure to light. Individuals who lack this protein become deficient in 
11-cis retinal, and their rod photoreceptors are unable to respond to light,
eventually leading to blindness. Prior to undertaking these experiments, a
strain of Briard dogs with a spontaneous defect in the RPE65 gene was
successfully treated, with a restoration of visual function in the treated dogs. In
all instances, the recombinant viral vector carrying the restorative gene was 
introduced by surgery of the eye into the subretinal space. Although the 
phase I human experiments were aimed only at determining the safety of 
this treatment, an improvement in the vision of four of the six patients was 
observed. This encouraging result notwithstanding, a longer follow-up 

FIGURE 11.16 Production of packaged retrovirus vector RNA. The packaging cell
line has two separate retroviral gene regions on its chromosomes; one contains
the gag gene, and the other contains the pol and env genes. In each of these
inserts, transcription is driven by sequences within the 5' long terminal
repeat (5'-LTR) region. Both virus DNA segments lack the encapsidation sequence
(ψ⁺) that is required for packaging a retroviral genome into a viral capsid.
The packaging cell line synthesizes viral proteins, but because there is no
encapsidation sequence within either of the retroviral mRNAs, empty viral
capsids are produced. The viral proteins continue to be synthesized after the
transfection of a packaging cell line with a full-length retroviral vector
carrying a remedial (therapeutic) gene (Gene X) and a selectable marker gene
(Neoʳ). The full-length RNAs from the retrovirus vector sequence are
replicated, and because they have an encapsidation region (ψ⁺), they are
packaged into viral capsids. The released viral particles are replication
defective because they do not have a pol gene.

period and clinical trials with many more subjects are required before this
procedure can gain FDA approval. However, on the positive side, if the 
procedure is used to treat younger patients with less severe disease, greater 
improvements of visual function might be expected. 

Notwithstanding the advantages of viral vectors, they are often immu¬ 
nogenic, costly to maintain, and difficult to produce on a large scale 
without high-level expertise. Consequently, various nonviral gene transfer 
systems have been devised. The least complicated nonviral gene delivery 
system is the introduction of pure (naked) DNA constructs directly into the 
cells of a target tissue. When plasmid DNA was injected into mouse skel¬ 
etal muscle, some of the cells took up the DNA and a reporter gene was 
expressed for more than 50 days. However, this approach is limited to 
accessible tissues and requires large amounts of DNA. Pure DNA constructs
that cover the surfaces of 1- to 3-μm-diameter gold particles have
been propelled with a gene gun into skin cells (see chapter 18) and into 
subcutaneous tumor cells. Therapeutic genes delivered in this way were 
expressed in the targeted tissues. Surrounding a DNA construct with arti¬ 
ficial lipid layers that form a lipid sphere with an aqueous core (liposome) 
facilitates the passage of the DNA through a cell membrane. 

To avoid degradation of introduced DNA, DNA-molecular conjugates 
have been developed. With this approach, poly-L-lysine is chemically 
linked (conjugated) to a molecule that binds to a specific cell receptor. Next, 
DNA is added and combines with the poly-L-lysine to form a tightly com¬ 
pacted, twisted, solid ring. With the cell receptor-binding sites arrayed on 
the outside of the DNA-molecular conjugate, the complexes bound exclu¬ 
sively to the specified cells, but the frequency of transfection was low. To 
remedy this problem, in addition to a cell receptor-binding amino acid 
sequence, other short amino acid sequences that facilitate cell membrane 
fusion and internalization of the DNA-molecular conjugate and both pro¬ 
tect the DNA from degradation and direct it to the cell nucleus have been 
combined into a single polypeptide. The addition of such a multifunctional 
protein to a DNA-molecular conjugate (Fig. 11.17) could enhance the effi¬ 
ciency of transfection. The current nonviral gene delivery systems have 
two major limitations: (1) the frequency of transfection is often too low to 
create a therapeutic effect and (2) the duration of therapeutic gene expres¬ 
sion is too brief to provide an effective treatment. 

A human artificial chromosome (HAC) would be an exciting thera¬ 
peutic vector. The DNA-carrying capacity would be very large, which 
would allow the inclusion of several genes, each with a complete set of 
regulatory elements. This type of vector should have long-term stability 
and sustained expression of a therapeutic gene(s) within either a prolifer¬ 
ating or a quiescent target cell. HACs (also called human engineered chro¬ 
mosomes) have been created in two ways (Fig. 11.18). First, HACs were 
assembled by ligation of individual chromosome components, including 
the chromosome ends (telomeres), centromere, and origins of replication. 
Telomere and centromere sequences were mixed with high-molecular- 
weight human DNA that had both origins of replication and a selectable 
gene marker in the presence of ligase. Cells were transfected with the DNA 
from the ligation mixture, and those with HACs were selected and main¬ 
tained. A second method of forming a HAC entails paring down an existing 
human chromosome by deleting material from within each chromosome 
arm to form a "minichromosome." HACs have been formed that range in 
size from 0.7 to 400 megabases. However, before HACs are used for gene 

FIGURE 11.17 Schematic representation of a DNA-molecular conjugate delivery
system. Short peptide sequences (motifs) that facilitate cell-specific binding (red), 
fusion with the cell membrane and internalization (yellow), protection of the DNA 
by preventing it from being routed to a lysosome (orange), and entry into the 
nucleus (dark blue) are attached to poly-L-lysine (light blue). The poly-L-lysine com¬ 
ponent combines with the DNA containing a therapeutic gene to form a condensed 
DNA-polylysine ring (medium blue) with the protein motifs facing outward. 


therapy, certain issues must be addressed. For example, will HACs be efficiently
introduced into the nuclei of target cells? Will effective levels of
therapeutic gene expression be maintained for extended periods of time? 

Before therapeutic genes are introduced in human beings, the efficacy 
of using a particular gene along with a specific delivery system is tested 
on small animals, typically mice. This is intended to ensure not only that 
the added gene relieves a particular ailment, but also that there are no 
unexpected side effects that occur as a consequence of the treatment. 
Recently, researchers reported the administration of a gene to dystrophic 
and normal mice that helped them to increase both muscle biomass and 
strength. The growth factor myostatin plays a critical role in regulating 
skeletal muscle mass. It negatively regulates both the number of myofibers 
formed in development and the postnatal growth of muscles. It was previ¬ 
ously suggested that a number of neuromuscular disorders, including 
muscular dystrophies and age-related muscle disorders, might be "treated" 
using gene therapy approaches that prevent or lessen the inhibition of 
muscle growth by myostatin (Fig. 11.19). This could be achieved by the 
knockout of myostatin gene expression or by the overexpression of 
insulin-like growth factor 1, which can increase muscle size and strength. 
To this end, transgenic mice were created, from both dystrophic and 
normal mice, by a single postnatal intramuscular injection of adeno-associated
virus that resulted in the overexpression of a gene encoding the
myostatin inhibitor protein follistatin (Fig. 11.19). This single treatment 
enhanced muscle mass and strength in normal and dystrophic mice for 

FIGURE 11.18 Construction of human artificial chromosomes (HACs). (A) Formation
of a HAC in vitro by ligation. Chromosome elements (telomere [red] and cen¬ 
tromere [blue]) and high-molecular-weight human DNA (green) that has origins of 
replication and a selectable marker gene (green rectangle) are placed in a ligation 
reaction mixture. After ligation, the DNA is introduced into cells in culture, and the 
selectable marker system is used to identify the cells with a functional HAC. The 
ligation products that do not form a complete chromosome are not maintained 
through successive cell divisions. (B) Formation of a HAC by truncation of a chro¬ 
mosome in situ. The square brackets denote DNA sections that are removed from 
the chromosome arms to produce a truncated chromosome (minichromosome). 
Chromosome DNA can be deleted by radiation of human cells and recovered as a 
minichromosome in a cell hybrid after fusion of rodent and irradiated human 
cells. 


more than 2 years. This therapeutic strategy warrants serious consider¬ 
ation for clinical trials in the treatment of human muscle diseases. One 
potential concern, other than the safety of the viral vector, is that gene 
therapy approaches that are intended to treat muscle diseases by increasing 
muscle mass and strength might also be used for "gene doping" of healthy 
individuals to enhance athletic performance. 

Targeting Systems 

Lipids. While the effectiveness of siRNAs for specifically inhibiting gene 
expression in cultured cells has been demonstrated on numerous occa¬ 
sions, it is difficult to efficiently deliver these RNAs to tissues in vivo. One 
approach to overcome this difficulty has been to chemically couple an 
siRNA (at the terminal hydroxyl group of the sense strand RNA) to choles¬ 
terol (Fig. 11.20). The siRNA in question is complementary to an mRNA 
that encodes apolipoprotein B, a molecule involved in the metabolism of 

FIGURE 11.19 Schematic representation of the regulation of muscle growth and
development by myostatin in nontransgenic mice (A) and in transgenic mice (B)
that overexpress the protein follistatin.


cholesterol. When the siRNA-cholesterol complex is intravenously injected 
into mice, it is taken up by the liver, jejunum (part of the small intestine), 
heart, kidney, lungs, and fat tissue cells. Once this complex is inside the 
tissue, the sense strand is destroyed and the antisense strand binds to the 
target mRNA. 

With this approach, the level of apolipoprotein B was reduced by more 
than 50% in the liver and by 70% in the jejunum. This resulted in a signifi¬ 
cant decrease in the plasma apolipoprotein B level, as well as the total 
amount of cholesterol. This strategy is an important first step in the devel¬ 
opment of a method to therapeutically lower cholesterol levels in 
humans. 
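
To make the base-pairing step concrete, the following Python sketch derives the antisense strand for a stretch of mRNA and locates where that strand would pair. It is only an illustration under simplifying assumptions: the sequence is invented (it is not the apolipoprotein B mRNA), the function names are hypothetical, and perfect Watson-Crick pairing is assumed, whereas real siRNA design also has to weigh overhangs, chemical modifications, and off-target matches.

```python
# Illustrative sketch only. The sequence is made up and the helper names are
# hypothetical; perfect Watson-Crick pairing is assumed throughout.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_strand(target_rna: str) -> str:
    """Return the RNA strand (written 5'->3') that base-pairs with target_rna."""
    return "".join(COMPLEMENT[base] for base in reversed(target_rna.upper()))


def binding_site(mrna: str, antisense: str) -> int:
    """Return the 0-based start of the region of mrna that pairs perfectly
    with the antisense strand, or -1 if there is no such region."""
    return mrna.upper().find(antisense_strand(antisense))


# A hypothetical 27-nucleotide stretch of mRNA and a 21-nucleotide target in it.
hypothetical_mrna = "GGCAUCAAGUCUGAGCUAACGGAUUCC"
target = hypothetical_mrna[5:26]

guide = antisense_strand(target)
print("antisense strand:", guide)
print("pairs at mRNA position:", binding_site(hypothetical_mrna, guide))
```

Because the antisense strand is the reverse complement of its target, applying the same function twice returns the original target sequence, which is a convenient sanity check on any such helper.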

A number of other molecules, including some long-chain fatty acids 
and bile acids, may be used in place of cholesterol to mediate the uptake of 
siRNAs into cells. A critical factor in mediating the interaction between 
fatty acid-conjugated siRNAs and lipoprotein particles is the length of the 
fatty acid alkyl chain. Thus, docosanyl (C₂₂) and stearoyl (C₁₈) conjugates
bind more tightly to high-density lipoprotein and subsequently silence
gene expression more effectively in vivo than lauroyl (C₁₂) and myristoyl
(C₁₄) conjugates. Studies are under way to improve the delivery of lipid-
conjugated siRNAs to treat a wide range of diseases. 

Bacteria. Bacteria that are normally found in association with various 
mammalian tissues and cells may be genetically engineered to produce 
therapeutic shRNAs. The engineered bacteria may then be used as vectors 
to deliver the therapeutic agent directly to the affected tissues. For example, 
a nonpathogenic strain of Escherichia coli was transformed with the plasmid 
vector TRIP containing the gene for the protein invasin, which permits 


FIGURE 11.20 A conjugate of cholesterol and an siRNA in which the cholesterol is 
coupled through the 5'-OH of the sense strand of the siRNA. The cholesterol facili¬ 
tates uptake of the siRNA into specific tissues. The antisense strand becomes part 
of the RISC and specifies where the mRNA is to be cleaved. 


[Figure labels: siRNA, strand selection, mRNA, RISC, Argonaute]




E. coli to enter β1-integrin-positive mammalian cells, and the gene HlyA,
which encodes listeriolysin O, a protein that enables genetic material to
escape from entry vesicles (Fig. 11.21). In addition, the TRIP vector carries
a sequence encoding an shRNA, under the control of a bacterial promoter, that is directed
against the mRNA produced by the cancer gene CTNNB1. As long as a
bacterium is able to enter target mammalian cells and release shRNAs, the 
shRNA may be directed against any specific mRNAs. The E. coli cells act as 
a vector to transport the shRNAs to where they are required, e.g., cancer 
cells. This approach has been shown to work both for cancer cells in culture 
and with mice. With whole animals, the bacteria can be administered 
orally. 

Collagen. The protein polymer collagen, isolated from calf dermis, can be 
digested, under acidic conditions, by the proteolytic enzyme pepsin to 
form subunits of approximately 300 kDa each. These rod-like proteins 
(approximately 300 nm in length by 1.5 nm in diameter) are positively 
charged and therefore readily interact with and bind to negatively charged 
siRNAs (Fig. 11.22). These "atelocollagen" particles protect siRNAs from 
nuclease digestion and also can be injected locally for tissue-targeting 
delivery of the siRNAs. For example, siRNA-atelocollagen complexes have 
been efficiently delivered to tumor cells in mice and, after injection, can 
exist in an intact form for at least 3 days. Furthermore, in mice, siRNA- 
atelocollagen complexes have been found to inhibit tumor growth in bone 
cells. This method of packaging siRNA promises to be both reliable and 
safe, depending on the tissue involved. 


FIGURE 11.21 Use of a nonpathogenic strain of E. coli to deliver siRNAs to certain tissues.
The bacterium was engineered to carry the gene for the protein invasin, which permits
E. coli to enter β1-integrin-positive mammalian cells, as well as the gene HlyA,
encoding listeriolysin O, which permits the shRNAs synthesized by the bacterium
to be released inside the mammalian cell.

FIGURE 11.22 Negatively charged siRNAs bind to positively charged atelocollagen.
The complex greatly facilitates the delivery of siRNAs to specific tissues. 


Antibodies. It has become relatively straightforward to generate mono¬ 
clonal antibodies against nearly any target protein; to humanize those 
antibodies, or their variable regions; and then to produce them in heterolo¬ 
gous host cells. Moreover, at the DNA level, it is easy to fuse the antibody 
gene with the gene for another protein. With this in mind, the gene 
encoding single-chain Fab fragments that bind specifically to a protein 
called ErbB2, which is found on the surfaces of breast cancer cells, was 
fused to the gene for the positively charged nucleic acid-binding protein 
protamine. The fusion protein binds to the surfaces of cells expressing 
ErbB2 and at its C terminus carries the 51-amino-acid-long protamine, 
which readily binds to added siRNAs (Fig. 11.23). In one test of this system,


FIGURE 11.23 A single-chain Fab fragment directed against a mammalian cell surface 
protein is fused to the positively charged polypeptide protamine, which binds
noncovalently to negatively charged siRNAs. The Fab fragment acts to deliver the
siRNA to specific cells. Note that a conventional (two-chain) Fab fragment has also 
been used to deliver siRNAs. 

FIGURE 11.24 Schematic representation of the secondary structure of a chimeric RNA
molecule consisting of an aptamer and an siRNA. The portion of the aptamer that 
binds to the target protein and the siRNA portion of the molecule are shaded. 


an anti-human immunodeficiency virus (HIV) envelope Fab and an siRNA 
that is designed to cleave the HIV gag mRNA were employed. Using cells 
in culture, it was possible to reduce the amount of secreted Gag protein (the 
protein of the nucleocapsid shell around the RNA of a retrovirus) by >70%. 
This system also works in vivo when the construct is injected either intra¬ 
venously or directly into tumors. The hope is that by using a combination 
of specific antibodies (or antibody fragments) that direct siRNAs only to 
certain cells, and siRNAs that selectively cleave specific target mRNAs, this 
system can be used to treat a wide range of diseases. 

Aptamers. The binding specificity to a target antigen that is a central fea¬ 
ture of the functioning of antibodies is also a property of aptamers. Thus, 
conjugating aptamers, which bind to specific cell surface proteins, to 
siRNAs that are designed to reduce the expression of certain mRNAs 
should provide another method of targeting siRNAs to specific tissues or 
cells. Also, since both aptamers and siRNAs are chemically synthesized 
RNA oligonucleotides, it should be simple and straightforward to synthe¬ 
size chimeric RNA molecules that include both the binding specificity of an 
aptamer and an siRNA that targets a specific mRNA (Fig. 11.24). For 
example, an aptamer that binds selectively to a prostate-specific membrane 
antigen (found on prostate cancer cells) was first selected. Then, a 21-bp 
siRNA directed against mRNAs encoded by either of two genes that are 
necessary for prostate cells to survive was added to the aptamer sequence. 
Both activities (i.e., aptamer binding and RNAi) were maintained in the 
chimeric molecule. The targeting aptamer did not impair the ability of the
siRNA to silence the target gene, and the presence of the siRNA did not 
affect the ability of the aptamer to bind to its target. This simple but highly 
effective approach should be amenable to treating a wide range of human 
diseases provided that (1) silencing specific genes in a population produces 
therapeutic benefits and (2) there are surface receptors (usually proteins) 
that distinguish the target cell population and allow the siRNA to be inter¬ 
nalized by the cell. 


SUMMARY 


A number of human disorders that result from the overpro¬ 
duction of a normal protein may be treated by using (1) 
nucleotide sequences that bind to a specific mRNA and pre¬ 
vent its translation, i.e., an antisense oligonucleotide; (2) RNA 
sequences that bind and cleave specific RNA molecules, i.e., 
ribozymes; (3) small RNA molecules, i.e., aptamers, that 
assume a highly organized secondary and tertiary structure 
and bind tightly to a wide range of molecules, including pro¬ 
teins, amino acids, and drugs; or (4) small double-stranded 
RNA molecules that direct the sequence-specific degradation 
of mRNA, i.e., interfering RNAs. These techniques may also 
be used to lessen or prevent diseases caused by pathogenic 
viruses and other disease-causing organisms. 

The greatest impediment to the development of nucleic 
acid-based therapeutic agents is the difficulty in delivering 
these agents to their target tissue(s). Initially, workers used 
virus-based delivery systems with some success, although 
some safety concerns exist in regard to the use of these vec¬ 
tors. Other approaches for the delivery of nucleic acid-based 
therapeutic agents include intravenous injection; local injec¬ 
tion at the site of the pathology; packaging the nucleic acid 
into cationic liposomes; physical methods, like electroporation,
sonoporation, or hydrodynamic pressure; and conjugating
the nucleic acid to another molecule, such as a lipid
molecule, cholesterol, collagen, an antibody fragment, or an
aptamer.

The development of effective treatments for genetic dis¬ 
eases has been elusive because, in many instances, the appro¬ 
priate gene product cannot be provided to a patient. However, 
when a normal version of a gene has been identified and 
cloned, it may be possible that either it or a cDNA derivative 
can be used to correct the defect in affected individuals. Viral 
and nonviral systems have been developed for the delivery of 
therapeutic genes. Viral vectors take advantage of the ability 
of a virus to penetrate a specific cell, protect the DNA from 
degradation, and direct it to the cell nucleus. A number of 
viruses have been engineered for gene therapy applications. 
Packaging cell lines for some viral systems ensure that virtu¬ 
ally no infectious viruses are present in a sample of vector 
viruses. Nonviral gene delivery systems include injection of 
pure DNA, bombardment of a target tissue with DNA-coated 
particles, and cellular uptake of DNA that is enclosed within a 
lipid envelope. In addition, HACs may find use as vectors for 
the long-term maintenance and expression of therapeutic 
genes in human cells. Generally, the major drawbacks of the 
current generation of gene therapy vector systems are immunogenicity,
lack of cell specificity, inefficient gene transfer, and
limited therapeutic gene expression. 
REVIEW QUESTIONS

1. How can antisense oligonucleotides be used to treat psori¬ 
asis? 

2. What are ribozymes, and how can they be used as human 
therapeutic agents? 

3. What are interfering RNAs, and how might they be used as 
human therapeutic agents? 

4. What is an aptamer, and how is it used as a therapeutic 
agent? 

5. How can antibody genes be used to confer passive immu¬ 
nity? 

6. How can interfering RNAs be delivered to specific cells? 


7. In developing a new nucleic acid-based therapeutic agent, 
how would you decide between antisense oligonucleotides, 
ribozymes, aptamers, and interfering RNA? 

8. What are the key attributes of a therapeutic gene delivery 
system for humans? 

9. How can the interferon response, which is usually induced 
by double-stranded RNA, be avoided when utilizing siRNAs 
as therapeutic agents? 

10. What are some of the advantages and disadvantages of 
deoxyribozymes compared to ribozymes? 

11. How can the progression of age-related macular degener¬ 
ation be limited using RNA therapeutics? 


Vaccines 



Subunit Vaccines

Herpes Simplex Virus
Foot-and-Mouth Disease
Cholera
SARS

Staphylococcus aureus 
Human Papillomavirus 

Peptide Vaccines 

Foot-and-Mouth Disease 
Malaria 

Genetic Immunization: DNA 
Vaccines 

Delivery 
Dental Caries 

Attenuated Vaccines 

Cholera 

Salmonella Species 
Leishmania Species 
Herpes Simplex Virus 

Vector Vaccines 

Vaccines Directed against Viruses 
Vaccines Directed against Bacteria 
Bacteria as Antigen Delivery Systems 

SUMMARY 
REFERENCES 
REVIEW QUESTIONS 


Vaccination protects a recipient from pathogenic agents by establishing
an immunological resistance to infection. An injected or oral
vaccine induces the host to generate antibodies against the disease- 
causing organism; therefore, during future exposures, the infectious agent 
is inactivated (neutralized, or killed), its proliferation is prevented, and the 
disease state is not established. 

Over 200 years ago, in 1796, Edward Jenner experimentally tested the 
folklore-based notion that human infection with a mild cattle disease called 
cowpox would protect infected individuals against the human disease 
smallpox. Smallpox is an extremely virulent disease with a high death rate. 
If one survives, permanent disfigurement, mental derangement, and blind¬ 
ness often follow. Jenner inoculated James Phipps, an 8-year-old boy, with 
exudate from a cowpox pustule. In two separate trials after the initial vac¬ 
cination, the boy was fully protected against human smallpox. This country 
doctor had discovered the principle of vaccination. 

Communicable diseases such as tuberculosis, smallpox, cholera, 
typhus, bubonic plague, and poliomyelitis, have in the past been a scourge 
for humankind. With the advent of vaccination, antibiotics, and effective 
public health measures, these epidemic diseases have, for the most part, 
been brought under control (Table 12.1). Occasionally, however, protective 
measures become ineffective, and devastating new outbreaks occur. In 
1991, a cholera epidemic struck Peru, producing, over the next 3 years, 
approximately 1 million infections and several thousand deaths. Also, for 
many current human and animal diseases, there are no vaccines. Today, 
more than 2 billion humans suffer from diseases that theoretically could be 
curtailed by vaccination. In addition, new diseases for which vaccines 
might be useful continue to emerge. 

In recent years, in some developed countries, a small but vocal minority 
of individuals have refused to have their children vaccinated. These indi¬ 
viduals argue that many of the previously common illnesses have been 
vanquished, and they fear the potential side effects of the vaccinations 
more than the disease itself. In addition, many of these people question 
modern medicine and instead prefer to rely upon so-called traditional or 

TABLE 12.1 Annual cases in Canada from various diseases before and after the
introduction of vaccines against the causative agents of the diseases

Disease                                   Annual no. of cases before   No. of cases
                                          vaccine was introduced       in 2002
Polio                                     20,000                       0
Diphtheria                                9,000                        0
Rubella                                   69,000                       16
Mumps                                     52,000                       197
Haemophilus influenzae type b infection   2,000                        48
Whooping cough                            25,000                       2,557
Measles                                   300,000                      7

natural therapies. In fact, the small number of individuals who are not vac¬ 
cinated benefit from the fact that the vast majority of other people in society 
have been immunized, thereby making it difficult for many diseases to 
spread through a community. However, in communities where vaccination 
levels decrease below a certain level, there is a real danger of some tradi¬ 
tional diseases making a comeback. 
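
As a rough illustration of the scale of these gains, the sketch below computes the percent reduction in annual cases from the figures in Table 12.1. The function name is hypothetical, and the calculation ignores population growth and changes in disease reporting, so the results are indicative only.

```python
# Figures copied from Table 12.1 (annual cases in Canada):
# disease -> (annual cases before the vaccine, cases in 2002).
cases = {
    "Polio": (20_000, 0),
    "Diphtheria": (9_000, 0),
    "Rubella": (69_000, 16),
    "Mumps": (52_000, 197),
    "Haemophilus influenzae type b infection": (2_000, 48),
    "Whooping cough": (25_000, 2_557),
    "Measles": (300_000, 7),
}


def percent_reduction(before: int, after: int) -> float:
    """Percent drop in annual cases relative to the pre-vaccine figure."""
    return 100.0 * (before - after) / before


for disease, (before, after) in cases.items():
    print(f"{disease}: {percent_reduction(before, after):.1f}% fewer cases")
```

On these figures every disease in the table shows a reduction of well over 80%, with whooping cough the least reduced.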

Modern vaccines typically consist of either a killed (inactivated) or a 
live, nonvirulent (attenuated) form of an infectious agent. Traditionally, the 
infectious agent is grown in culture, purified, and either inactivated or 
attenuated without, of course, losing the ability to evoke an immune 
response that is effective against the virulent form of the infectious 
organism. Notwithstanding the considerable success that has been achieved 
in creating effective vaccines against diseases such as German measles, 
diphtheria, whooping cough, tetanus, smallpox, and poliomyelitis, there 
are a number of limitations to the current mode of vaccine production. 

• Not all infectious agents can be grown in culture, so no vaccines 
have been developed for a number of diseases. 

• Production of animal and human viruses requires animal cell culture,
which is expensive.

• Both the yield and rate of production of animal and human viruses 
in culture are often quite low, making vaccine production costly. 

• Extensive safety precautions are necessary to ensure that laboratory 
and production personnel are not exposed to a pathogenic agent. 

• Batches of vaccine may not be killed or may be insufficiently attenuated
during the production process, thereby introducing virulent
organisms into the vaccine and inadvertently spreading the disease. 

• Attenuated strains may revert, a possibility that requires continual 
testing to ensure that the reacquisition of virulence has not 
occurred. 

• Not all diseases (e.g., acquired immune deficiency syndrome 
[AIDS]) are preventable through the use of traditional vaccines. 

• Most current vaccines have a limited shelf life and often require 
refrigeration to maintain potency. This requirement creates storage 
problems in countries with large, unelectrified rural areas. 

Within the last 2 decades, recombinant DNA technology has provided a
means of creating a new generation of vaccines that overcome the drawbacks
of traditional vaccines. The availability of gene cloning has enabled
researchers to contemplate various novel strategies for vaccine development.

• Virulence genes could be deleted from an infectious agent that 
retains the ability to stimulate an immunological response. In this 
case, the genetically engineered agent could be used as a live vaccine 
without concern about reversion to virulence, because it is impossible
for a whole gene to be reacquired spontaneously during
growth in pure culture. 

• Live nonpathogenic carrier systems that carry discrete antigenic 
determinants of an unrelated pathogenic agent could be created. In 
this form, the carrier system facilitates the induction of a strong 
immunological response directed against the pathogenic agent. 

• For infectious agents that cannot be maintained in culture, the genes
for the proteins that have critical antigenic determinants can be isolated,
cloned, and expressed in an alternative host system, such as
Escherichia coli or a mammalian cell line. The proteins encoded by these
cloned genes can be formulated into a vaccine.

• There are some infectious agents that do not damage host cells 
directly; instead, the disease condition results when the host immune 
system attacks its own (infected) cells. For these diseases, it may be 
possible to create a targeted cell-specific killing system. Although 
not a true vaccine, this type of system attacks only infected cells, 
thereby removing the source of the adverse immunological response. 
In these cases, the gene for a fusion protein is constructed. First, one 
part of this fusion protein binds to an infected cell. Then, the other 
part kills the infected cell. 

Because of less stringent regulatory requirements, the first vaccines
that were produced by recombinant DNA techniques were for animal diseases,
such as foot-and-mouth disease, rabies, and scours, a diarrheal disease
of pigs and cattle. In addition, many more animal vaccines are
currently being developed. For human diseases, a large number of recombinant
vaccines are currently in various stages of development, including
clinical trials (Table 12.2).

Unfortunately, in comparison to the number of new therapeutic agents,
very few recombinant-DNA-based vaccines have been developed. Why,
according to the vaccine producers, does it take so long for new vaccines to
come to the marketplace? First, while there were 25 major vaccine
manufacturers worldwide in 1970, in 2005 there were only 5. Second, vaccines
are currently viewed as "almost a commodity," with little financial incentive
to develop new vaccines; in 2005, the worldwide market for preventive
vaccines was approximately $8 billion. Third, the U.S. government is a major
purchaser of vaccines, forcing discount prices and thereby decreasing the
potential profit. Fourth, in 1980 in the United States, "good manufacturing
practices" were introduced into vaccine production, causing manufacturing
costs to increase dramatically. Fifth, the transition from conventional to
newer processes for vaccine production is expensive and time-consuming
(not including clinical trials), so that it is preferable to continue using a more
established technology. On the positive side, the focus of the larger vaccine
manufacturers on large-scale products has provided smaller biotechnology
companies with a number of niche opportunities to develop and market
new products. Finally, since most vaccines are intended to protect large
populations, the very large amounts of money that companies often charge
to treat a single individual with some of the newer therapeutic agents (see
chapter 10) are unrealistic for pricing of a new vaccine. In fact, it is precisely
in many poorer countries, where most individuals cannot afford to pay very
much for treatment or immunization, that vaccines are needed the most.

TABLE 12.2 Human disease agents for which recombinant vaccines are currently
being developed

Pathogenic agent                    Disease

Viruses
Varicella-zoster viruses            Chicken pox
Cytomegalovirus                     Infection in infants and immunocompromised patients
Dengue virus                        Hemorrhagic fever
Hepatitis A virus                   High fever, liver damage
Hepatitis B virus                   Long-term liver damage
Herpes simplex virus type 2         Genital ulcers
Influenza A and B viruses           Acute respiratory disease
Japanese encephalitis               Encephalitis
Parainfluenza virus                 Inflammation of the upper respiratory tract
Rabies virus                        Encephalitis
Respiratory syncytial virus         Upper and lower respiratory tract lesions
Rotavirus                           Acute infantile gastroenteritis
Yellow fever virus                  Lesions of heart, kidney, and liver
Human immunodeficiency virus        AIDS

Bacteria
Vibrio cholerae                     Cholera
E. coli enterotoxin strains         Diarrheal disease
Neisseria gonorrhoeae               Gonorrhea
Haemophilus influenzae              Meningitis, septicemic conditions
Mycobacterium leprae                Leprosy
Neisseria meningitidis              Meningitis
Bordetella pertussis                Whooping cough
Shigella strains                    Dysentery
Streptococcus group A               Scarlet fever, rheumatic fever, throat infection
Streptococcus group B               Sepsis, urogenital tract infection
Streptococcus pneumoniae            Pneumonia, meningitis
Clostridium tetani                  Tetanus
Mycobacterium tuberculosis          Tuberculosis
Salmonella enterica serovar Typhi   Typhoid fever

Parasites
Onchocerca volvulus                 River blindness
Leishmania spp.                     Internal and external lesions
Plasmodium spp.                     Malaria
Schistosoma mansoni                 Schistosomiasis
Trypanosoma spp.                    Sleeping sickness
Wuchereria bancrofti                Filariasis


Subunit Vaccines 

Vaccines generally consist of either killed or attenuated forms of the whole 
pathogenic agent. The antibodies elicited by these vaccines initiate an 
immune response to inactivate (neutralize) pathogenic organisms by 
binding to proteins on the outer surface of the agent. So, do vaccines need 
to contain the whole organism, or will specific portions of pathogenic 
organisms suffice? For disease-causing viruses, it has been shown that 
purified outer surface viral proteins, either capsid or envelope proteins 
(Fig. 12.1), are often sufficient for eliciting neutralizing antibodies in the 
host organism. Vaccines that use components of a pathogenic organism 
rather than the whole organism are called "subunit" vaccines; recombinant 
DNA technology is very well suited for developing new subunit vaccines. 

There are advantages and disadvantages to the use of subunit vaccines. 
On the positive side, using a purified protein(s) as an immunogen ensures 
that the preparation is stable and safe, is precisely defined chemically, and 
is free of extraneous proteins and nucleic acids that can initiate undesirable 
side effects in the host organism. On the negative side, purification of a 
specific protein can be costly, and in certain instances, an isolated protein 
may not have the same conformation as it does in situ (within the viral 
capsid or envelope), with the result that its antigenicity is decreased. 
Obviously, the decision to produce a subunit vaccine depends on an
assessment of several biological and economic factors.

Herpes Simplex Virus 

Herpes simplex virus (HSV) has been implicated as a cancer-causing
(oncogenic) agent, in addition to its more common roles in causing sexually
transmitted disease, severe eye infections, and encephalitis, so prevention
of HSV infection by vaccination with either killed or attenuated virus may
put the recipient at risk for cancer. Thus, protection against HSV would be
best achieved by a subunit vaccine, which would not be oncogenic.

FIGURE 12.1 Schematic representation of an animal virus. Viruses generally consist of
a relatively small nucleic acid genome (3 to 200 kb of either double- or single-
stranded DNA or RNA) within a viral protein capsid that is sometimes, depending
on the virus, surrounded by a protein-containing viral envelope (membrane).

FIGURE 12.2 Schematic representation of the development of a subunit vaccine
against HSV. The isolated HSV gD protein gene is used to transfect CHO cells.
Then, the transfected cells are grown in culture and produce gD protein. Mice
inoculated with the purified gD protein are protected against infection by HSV.

The primary requirement for creating any subunit vaccine is identification
of the component(s) of the infectious agent that elicits antibodies that
react against the intact form of the infectious agent. The HSV type 1 (HSV-1)
envelope glycoprotein D (gD) is such a component, because after injection
into mice, it elicits antibodies that neutralize intact HSV. The HSV-1 gD
gene was isolated and then cloned into a mammalian expression vector and
expressed in Chinese hamster ovary (CHO) cells (Fig. 12.2), which, unlike
the E. coli system, properly glycosylate foreign eukaryotic proteins. The
complete sequence of the gD gene encodes a protein that becomes bound
to the mammalian host cell membrane (Fig. 12.3A). However, a membrane-
bound protein is much more difficult to purify than a soluble one.
Consequently, the gD gene was modified by removing the nucleotides
encoding the C-terminal transmembrane-binding domain (Fig. 12.3B). The
modified gene was then transfected into CHO cells, where the product
was glycosylated and secreted into the external medium (Fig. 12.2). In
laboratory trials, the modified form of gD was effective against both HSV-1
and HSV-2.
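The gene modification described above, removing the nucleotides that encode a C-terminal domain, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coding sequence is a short made-up placeholder rather than the gD gene, the position of the first removed codon is hypothetical, and the function name remove_c_terminal_domain is invented for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def remove_c_terminal_domain(cds: str, first_removed_codon: int, stop: str = "TAA") -> str:
    """Truncate a coding sequence before a C-terminal domain.

    cds                 -- coding sequence (DNA), beginning with the start codon
    first_removed_codon -- 1-based number of the first codon to delete
                           (hypothetical here; a real value would come from
                           the mapped transmembrane domain)
    stop                -- stop codon appended after the last retained codon
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 < first_removed_codon <= n_codons:
        raise ValueError("first_removed_codon is outside the coding sequence")
    if stop not in STOP_CODONS:
        raise ValueError("stop must be one of TAA, TAG, or TGA")
    # Keep codons 1 .. first_removed_codon - 1, then terminate translation.
    kept = cds[: 3 * (first_removed_codon - 1)]
    return kept + stop


# Made-up 8-codon placeholder: ATG GCT AAA GGT CTG CTG ATT TAA
toy_cds = "ATGGCTAAAGGTCTGCTGATTTAA"
# Pretend that codons 5 onward encode the membrane-spanning region.
soluble_cds = remove_c_terminal_domain(toy_cds, first_removed_codon=5)
print(soluble_cds)
```

The truncated sequence keeps the reading frame of the retained codons, which is the property that matters when a soluble, secreted form of a membrane protein is wanted.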

Foot-and-Mouth Disease 

Foot-and-mouth disease virus (FMDV) has a devastating impact on cattle
and swine and is extremely virulent, but for the most part, it has been
possible to keep the negative effects of the virus to a minimum by using
formalin-killed FMDV preparations as a vaccine. Approximately 1 billion doses
of this killed-virus vaccine are used worldwide each year. The availability
of the vaccine notwithstanding, in 2001, there was a major outbreak of foot-
and-mouth disease in Europe in which tens of thousands of cattle were
slaughtered and their carcasses were incinerated in an effort to prevent the
virus from spreading.

FIGURE 12.3 (A) Location in the envelope of HSV-1 gD with the transmembrane
domain. (B) Extracellular location of a soluble gD without the transmembrane
domain.

Research on FMDV found that the major antigenic determinant that 
induces neutralizing antibodies is capsid viral protein 1 (VP1). Although 
purified VP1 is a much less potent antigen than intact viral particles, it can 
still elicit neutralizing antibodies by itself and therefore can protect animals 
from infection by FMDV. Thus, the gene for VP1 became a target for 
cloning. 

The genome of FMDV is composed of single-stranded RNA (approximately
8,000 nucleotides long). Therefore, for recombinant DNA manipulations,
it was necessary first to synthesize a double-stranded complementary
DNA (cDNA) of the entire genome (Fig. 12.4). This cDNA was then
digested with restriction enzymes, and the fragments were cloned in an E.
coli expression vector. The product of the VP1 coding sequence was identified
immunologically as part of a fusion protein under the control of the pL
promoter-cI repressor system. The fusion protein was 396 amino acids long
and consisted of a portion of a stabilizing carrier protein, i.e., the
bacteriophage MS2 replicase protein, as well as the entire coding sequence of the
FMDV VP1 protein (Fig. 12.4). The fusion protein containing the VP1 protein
fragment was able to generate neutralizing antibodies to FMDV.

A fusion protein, however, faces more government regulatory hurdles
than intact VP1 would because of the potential immunogenic effects of the
non-VP1 component. Therefore, the VP1 sequence alone will have to be
subcloned onto a different expression vector. Nevertheless, a subunit vaccine
for foot-and-mouth disease could soon be ready for preclinical trials.

FIGURE 12.4 Schematic representation of the development of a subunit vaccine
against foot-and-mouth disease. The entire viral RNA is made into cDNA, which is
then digested with restriction enzymes. The DNA fragments are cloned into an
expression vector in frame with the gene for the E. coli bacteriophage MS2
replicative protein. The plasmid constructs are used to transform E. coli, and then the
stable fusion protein is isolated and used to inoculate animals.

Cholera 

The bacterium Vibrio cholerae, the causative agent of cholera, colonizes the
small intestine and secretes large amounts of a hexameric enterotoxin,
which is the actual pathogenic agent. This protein consists of one subunit,
the A subunit, that has ADP ribosylation activity and stimulates adenylate
cyclase, and five identical B subunits that bind specifically to an intestinal
mucosal cell receptor (Fig. 12.5). The A subunit has two functional domains:
the A1 peptide, which contains the toxic activity, and the A2 peptide, which
joins the A subunit to the B subunits. Until a few years ago, a traditional
cholera vaccine consisting of phenol-killed V. cholerae was in common use.
This vaccine generated only moderate protection, typically lasting from
about 3 to 6 months. More recently, a vaccine (Dukoral) consisting of
heat-inactivated V. cholerae Inaba classic strain, heat-inactivated Ogawa classic
strain, formalin-inactivated Inaba El Tor strain, formalin-inactivated Ogawa
classic strain, and a recombinant cholera toxin B subunit, has come into use.
The vaccine is taken orally (two doses 1 week apart), and it is claimed that
an additional booster immunization is not required for about 2 years.


FIGURE 12.5 Schematic representation of hexameric cholera toxin. The A peptide
is shown in blue, and the B peptide is shown in green. (Labels: B subunit,
A2 peptide, A1 peptide.)


SARS



In 2003, there were more or less simultaneous outbreaks in several major
cities, including Hong Kong, Singapore, and Toronto, of a new, unknown
disease. The first case of this disease, severe acute respiratory syndrome, or
SARS, was reported in Guangdong Province, southern People's Republic of
China, in November 2002. Given the enormous frequency of air travel, the
disease rapidly spread to 29 countries on five continents. With the assistance
of the World Health Organization, authorities in affected regions
immediately implemented strict infection control procedures, so that by
mid-July 2003, the outbreak was effectively contained. However, this was
not before a total of 8,096 SARS cases and 774 associated deaths were
reported. Within a very short time, scientists had identified a novel
coronavirus as the causative agent of the disease. The SARS virus contains a
single-stranded plus-sense RNA genome of approximately 30 kb. In nature,
the viral spike protein, which is inserted into the viral membrane, binds to
a receptor protein that is present on the surfaces of mammalian host cells
(Fig. 12.6). Following the binding of the virus to the receptor, the viral and
cell membranes can fuse, thereby facilitating the entry of the virus into the
cell. The spike protein (or the external portion of the molecule) is an
attractive candidate for the development of a subunit vaccine. In practice, it was
found that the external portion of the spike protein (i.e., amino acids 318 to
510) could bind efficiently to the host cell receptor protein. Following the
determination of the complete nucleotide sequence of the SARS virus in
2003, it was relatively straightforward to express a codon-optimized version
of this 192-amino-acid peptide in CHO cells. In addition to encoding
the 192-amino-acid spike peptide, the DNA construct introduced into the
CHO cells also included a mammalian secretion signal, an N-terminal
(Staphylococcus aureus) protein A purification tag, and a tobacco etch virus
protease cleavage site (Fig. 12.7). The recombinant protein synthesized in
CHO cells was secreted into the growth medium, purified by affinity
chromatography on a column containing immobilized immunoglobulin G, and
then digested with tobacco etch virus protease to remove the protein A
purification tag. Using this construct, the spike protein fragment was
readily synthesized and purified. To date, the fully glycosylated form of
this subunit vaccine candidate has been shown to elicit a strong immune
response in mice. It still remains to be seen whether it can protect
immunized animals against infection with the SARS virus.

FIGURE 12.6 Schematic representation of the binding of the SARS virus glycosylated
spike protein to a cellular outer surface protein receptor.
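The logic of the expression construct can be summarized as an ordered list of elements, each with one job in production or purification. The Python sketch below is an illustration only: the element order (secretion signal, protein A tag, protease site, spike fragment) is an assumption inferred from the description of the tag as N-terminal and removable by protease digestion, and all names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    role: str


# Assumed N-terminal to C-terminal order of the encoded protein.
CONSTRUCT = [
    Element("secretion_signal", "directs the protein into the growth medium"),
    Element("protein_A_tag", "binds immobilized immunoglobulin G during purification"),
    Element("TEV_site", "recognized by tobacco etch virus protease"),
    Element("spike_318_510", "receptor-binding fragment of the SARS spike protein"),
]


def after_tev_cleavage(elements):
    """Return the elements that lie downstream of the protease site.

    This models the final purification step: digestion at the protease
    site separates the purification tag from the vaccine candidate.
    """
    names = [e.name for e in elements]
    site = names.index("TEV_site")
    return elements[site + 1:]


if __name__ == "__main__":
    for element in after_tev_cleavage(CONSTRUCT):
        print(f"{element.name}: {element.role}")
```

Placing the cleavage site between the tag and the fragment is what allows the tag to be used for affinity purification and then discarded, so that only the spike fragment remains in the final preparation.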

FIGURE 12.7 The main features of a portion of the recombinant plasmid construct
used to transfect CHO cells and to produce the fragment of the SARS spike protein
that interacts with the host cell receptor.

Staphylococcus aureus

The gram-positive bacterium S. aureus is a major cause of hospital-acquired
infection. This bacterium produces a pore-forming toxin; is a leading cause
of infections of the bloodstream, lower respiratory tract, and skin; and,
because of the emergence of antibiotic-resistant strains, is a serious public
health threat. To address the challenge of treating S. aureus infections,
whole-cell attenuated or killed vaccines have been developed. However,
these vaccines have not been particularly effective. Similarly, subunit
vaccines composed of individual bacterial surface proteins generate immune
responses that afford only partial protection when tested in experimental
animals. However, a more effective subunit vaccine has recently been
developed to protect individuals against S. aureus by combining several of
the bacterium's antigens (Fig. 12.8). Starting with one disease-causing
strain of S. aureus, 23 bacterial outer surface proteins were identified from
genomic DNA sequence data. Then, the coding regions of these proteins,
minus the signal sequences, were polymerase chain reaction (PCR) amplified
and cloned into plasmid vectors that enabled the proteins to be
expressed in E. coli with a poly-His tag at the N terminus of the protein (to
facilitate the purification of the overexpressed protein). The proteins were
expressed and purified, and mice were separately immunized with each of
the 23 purified proteins. The immunized mice were subsequently challenged
by injections of live disease-causing S. aureus. Many of the recombinant
surface proteins generated an immune response that afforded partial
protection against staphylococcal disease, with some proteins affording
more protection than others. However, over the long term, immunization
with individual surface proteins afforded only modest protection. A mixture
of the four proteins that individually generated the most effective
antibodies was used to immunize mice and was found to completely protect
against the pathogen. The experimental design ensured that only
common, and not strain-specific, S. aureus surface proteins were used to
immunize mice. Thus, it is not surprising that the tetravalent subunit vaccine
that was developed was effective against five different clinical isolates
(strains) of S. aureus. This work represents an important first step in the
development of an S. aureus vaccine.

Human Papillomavirus 

Human papillomavirus is the causative agent of many common sexually
transmitted diseases. While most of these infections are benign and often
asymptomatic, persistent infection with some strains of human papillomavirus
is associated with the development of cervical and related cancers, as
well as genital warts. Since human papillomavirus type 16 is associated
with approximately 50% of cervical cancers, a vaccine that prevents human
papillomavirus type 16 infection could significantly reduce the incidence of
cervical cancer. Moreover, a vaccine that is directed against several different
types of human papillomaviruses could effectively prevent nearly all
human papillomavirus-induced cervical cancers. To put this into perspective,
cervical cancer is the second most commonly diagnosed cancer among
women worldwide, accounting for more than 250,000 deaths per year.

FIGURE 12.9 Schematic representation of the virus-like particles assembled from
cloned and overproduced L1 proteins from the capsids of four different strains of
human papillomavirus. These virus-like particles are the constituents of a
commercial subunit vaccine against the virus.

In June 2006, the U.S. Food and Drug Administration approved a vaccine
that protects women against infection by human papillomavirus types
6, 11, 16, and 18, the types most frequently associated with cervical cancer
and genital warts. By the end of 2006, this vaccine had been approved for
use in more than 50 countries worldwide. The vaccine, called Gardasil, is
quadrivalent, i.e., it contains virus-like particles assembled from the major
capsid (L1) proteins of the above-mentioned four types of human
papillomavirus (Fig. 12.9 and Box 12.1). It was previously shown that the L1
protein can self-assemble into virus-like particles that resemble papillomavirus
virions, and these particles are highly immunogenic, inducing neutralizing
antibodies directed against the whole live virus. The gene for the L1 protein
from each of the four virus types was cloned and expressed in a recombinant
Saccharomyces cerevisiae (yeast) strain. Following separate fermentations
of the four yeast strains, the viral capsid proteins assembled into
virus-like particles (i.e., the viral capsid without any other viral proteins or
the viral nucleic acid). These virus-like particles were then purified and
combined to form the quadrivalent vaccine.


Peptide Vaccines 

The question arises as to whether a small discrete portion (domain) of a
protein can act as an effective subunit vaccine and induce the production
of neutralizing antibodies. Intuitively, one would expect that only the
portions, or domains, of a protein that are accessible to antibody binding, that
is, those on the exterior surface of the virus, would be immunologically
important and that those located in inaccessible regions inside the virus
particle could be ignored if they do not contribute to the conformation of
the immunogenic domain (Fig. 12.10). If this argument has validity, it is
possible that short peptides that mimic epitopes (antigenic determinants)
will be immunogenic and could be used as vaccines (peptide vaccines).

However, there are certain limitations to using short peptides as vaccines:

• To be effective, an epitope must consist of a short stretch of contiguous
amino acids, which does not always occur naturally.

• The peptide must be able to assume the same conformation as the
epitope in the intact viral particle.

• A single epitope may not be sufficiently immunogenic.

FIGURE 12.10 Generalized envelope-bound protein with external epitopes
(1 to 5) that might elicit an immune response.


BOX 12.1

A Vaccine To Prevent Cervical Cancer

On 15 September 2007, the headline on the front page of The Globe
and Mail, a Toronto, Canada, newspaper, read, "Should your daughter
get the needle?" The article that followed related how the Canadian
federal government, in conjunction with the Ontario provincial government,
was funding a program that would offer free vaccinations against human
papillomavirus to girls in grade 8 (typically 12- and 13-year-olds). The
vaccine, which had received approval in the United States a year earlier,
provides inoculated women with immunity against the viruses that are
responsible for approximately 70% of all cervical cancers and 90% of genital
warts. Grade 8 was chosen because, according to officials, it is before most
girls become sexually active. The vaccine, sold under the brand name
Gardasil, is given by needle in three doses over 6 months and is approved
for females between the ages of 9 and 26. The rationale for giving the
vaccine at such a young age is related to the fact that once a woman has been
exposed to the four strains of the virus for which the vaccine provides
protection, the vaccine will no longer be effective. The three doses of vaccine
cost about $300 to $400, although those inoculated through this program
received the vaccine free of charge. Boys can also get human papillomavirus
infections, but testing is still under way to determine whether the
vaccine works as well for them.

According to The Globe and Mail, "For many parents it's a no-brainer:
anything that will protect their daughters from cancer ... is worth the risks."
However, at the same time, a small but vocal minority has expressed
serious reservations about this program. On one hand, there are individuals
who do not trust the medical establishment, the pharmaceutical
companies, and/or the government. Others have expressed concerns that
some girls will naively believe that this vaccine will protect them against
any and all sexually transmitted diseases and use this as a rationale or
excuse for becoming sexually active at an early age. Still others have
questioned the potential side effects from the vaccine, despite the fact that
extensive clinical trials have shown that they are quite rare. Notwithstanding
the concerns of some individuals, the vaccine was initially offered through
school inoculation programs to young females in the Canadian provinces of
Newfoundland and Labrador, Prince Edward Island, Nova Scotia, and
Ontario. However, by September 2008, all of the other provinces in Canada
had decided to implement this program. The real benefits of the program
(hopefully an enormous reduction in cervical cancer) may not be known for
several decades; in the meantime, the debate will continue. Also, since the
vaccine does not protect against all strains of human papillomavirus, it is
essential that women continue to get an annual Pap test.


Foot-and-Mouth Disease 

Potential epitopes of the soluble antigenic FMDV VP1 were identified from
the X-ray crystallographic structure, and chemically synthesized domains
of the protein were tested as candidate peptide vaccines. Peptides
corresponding to amino acids 141 to 160, 151 to 160, and 200 to 213, which are
located near the C-terminal end of VP1, and amino acids 9 to 24, 17 to 32,
and 25 to 41, which are located near the N-terminal end of VP1, were each
bound to a separate inert carrier protein (keyhole limpet hemocyanin) and
injected into guinea pigs (Fig. 12.11). Very small peptides are usually
rapidly degraded unless they are bound to the surface of a larger carrier
molecule. A single inoculation with peptide 141 to 160 elicited sufficient
antibody to protect animals against subsequent challenges with FMDV. By
contrast, inoculation with complete VP1 or peptide 9 to 24, 17 to 32, or 25
to 41 yielded lower levels of neutralizing antibodies.

FIGURE 12.11 Structure of a peptide vaccine composed of identical short peptides
bound to a carrier protein.

In an additional experiment, a longer peptide consisting of amino acids 
141 to 158 joined to amino acids 200 to 213 by two proline residues elicited 
high levels of neutralizing antibodies in guinea pigs, even when it was 
injected without any carrier protein. This "two-peptide" molecule was 
more effective than either of the single peptides alone and prevented 
FMDV proliferation in cattle, as well as in guinea pigs. 
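The peptides discussed above are defined by residue numbers, which count from 1 and include both endpoints. The Python sketch below shows how such peptides, and the "two-peptide" molecule joined by two proline residues, could be cut from a protein sequence. The sequence used here is a made-up placeholder of the right length for the numbering to work, not the real VP1 sequence, and the function names are illustrative.

```python
# Placeholder only: a repeating string of the 20 amino acid letters,
# trimmed to 213 residues so that positions 1 to 213 exist.
# This is NOT the FMDV VP1 sequence.
PLACEHOLDER_VP1 = ("ACDEFGHIKLMNPQRSTVWY" * 11)[:213]


def peptide(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("residue range lies outside the sequence")
    return seq[start - 1:end]


def two_peptide(seq: str) -> str:
    """Join residues 141-158 to residues 200-213 with two prolines."""
    return peptide(seq, 141, 158) + "PP" + peptide(seq, 200, 213)


# Residue ranges of the six carrier-bound peptides named in the text.
CANDIDATES = [(141, 160), (151, 160), (200, 213), (9, 24), (17, 32), (25, 41)]

if __name__ == "__main__":
    for start, end in CANDIDATES:
        fragment = peptide(PLACEHOLDER_VP1, start, end)
        print(f"{start} to {end}: {len(fragment)} residues")
    print("two-peptide length:", len(two_peptide(PLACEHOLDER_VP1)))
```

With inclusive numbering, the two-peptide molecule contains 18 + 2 + 14 = 34 residues, which is still short enough to be made by chemical synthesis.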

Although these results were promising, the amount (dose) of peptide
material that had to be used to elicit an immunological response was
approximately 1,000 times the amount of inactivated FMDV needed to
elicit the same response. To overcome this problem, DNA encoding FMDV
VP1 peptide 142 to 160 was linked to the gene encoding a highly immunogenic
carrier molecule, hepatitis B virus core antigen (HBcAg). When the
gene for this fusion protein was expressed in either E. coli or animal cells in
culture, the protein molecules self-assembled into stable "27-nm particles,"
with the FMDV VP1 peptide located on the outer surface of the particle.
These particles are highly immunogenic in laboratory animals. Therefore,
HBcAg may be an effective carrier molecule for such short synthetic peptides.
A comparison of the immunogenicities in guinea pigs of a variety of
FMDV peptide vaccines, all of which contained the VP1 peptide 142 to 160
sequence, revealed that a fusion protein containing HBcAg and FMDV VP1
amino acids 142 to 160 was approximately 1/10 as immunogenic as inactivated
FMDV particles, 35 times more immunogenic than a fusion protein
containing E. coli β-galactosidase and FMDV VP1 amino acids 137 to 162,
and 500 times more immunogenic than the free synthetic peptide composed
of amino acids 142 to 160. Because synthetic peptides fused to
HBcAg do not interfere with the assembly of the 27-nm hepatitis B virus-like
particles, and because these particles are nearly as immunogenic as the
intact virus from which the synthetic peptide was derived, this approach
may become a general method for the delivery of peptide vaccines.
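The three ratios quoted in the comparison can be placed on one scale by setting the immunogenicity of inactivated FMDV particles to 1. The Python sketch below does this arithmetic with exact fractions; it treats the approximate figures from the text (1/10, 35 times, and 500 times) as if they were exact, and the dictionary keys are labels chosen for the example.

```python
from fractions import Fraction

# Reference point: inactivated FMDV particles are assigned a value of 1.
inactivated_fmdv = Fraction(1)

# Ratios quoted in the text, treated as exact for the purpose of the example.
hbcag_fusion = inactivated_fmdv / 10   # about 1/10 as immunogenic as the virus
bgal_fusion = hbcag_fusion / 35        # HBcAg fusion is 35 times more immunogenic
free_peptide = hbcag_fusion / 500      # HBcAg fusion is 500 times more immunogenic

RELATIVE_IMMUNOGENICITY = {
    "inactivated FMDV particles": inactivated_fmdv,
    "HBcAg-VP1(142-160) fusion": hbcag_fusion,
    "beta-galactosidase-VP1(137-162) fusion": bgal_fusion,
    "free peptide 142-160": free_peptide,
}

if __name__ == "__main__":
    for vaccine, value in RELATIVE_IMMUNOGENICITY.items():
        # Print both the exact fraction and its decimal value.
        print(f"{vaccine}: {value} ({float(value):.4f})")
```

On this scale the free synthetic peptide comes out at 1/5,000 of the inactivated virus, so presenting the same peptide on HBcAg particles recovers a factor of 500 of that gap.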
Malaria

The genus Plasmodium consists of approximately 125 known species of 
parasitic protozoa, 5 of which are known to infect humans and cause 
malaria. The Plasmodium life cycle is very complex. Sporozoites from the 
saliva of a biting female mosquito are transmitted to either the blood or the 
lymphatic system and then migrate to the liver and invade liver cells 
(hepatocytes) (Fig. 12.12). The parasite buds off the hepatocytes in merosomes 
containing hundreds or thousands of merozoites. These merosomes lodge 
in pulmonary capillaries and slowly disintegrate there, generally over 2 or 
3 days, releasing merozoites. The merozoites invade the red blood cells, 
where the parasite divides several times to produce new merozoites, which 
then leave the red blood cells and travel within the bloodstream to invade 
new red blood cells. The parasite eventually forms gametocytes, which 
may be ingested by feeding mosquitoes. Fusion of the gametes that develop 
from gametocytes leads to the formation of new sporozoites in the mos¬ 
quito that can infect new individuals, spreading the disease. 

In the life cycle of the malaria parasite, it is the asexual blood-stage 
multiplication that is responsible for most of the acute symptoms of the 
disease. In areas where malaria is endemic, some individuals show consid¬ 
erable resistance to the disease despite the fact that when their blood is 
examined they are found to carry the parasite. This resistance to the worst 
symptoms of malaria was shown to be a result of an "antibody-dependent 
cellular-inhibition" mechanism that inhibits parasite development. In other 
words, some individuals who were infected with the malaria parasite 
made antibodies against a parasite protein that prevented the growth of the 
parasite. Following a detailed study, it was determined that the protective 
antibodies targeted merozoite surface protein 3. When this protein was 
examined in different strains of Plasmodium, it was observed that while the 
N-terminal part of the protein varied considerably from one strain to 
another, the C-terminal end of the protein was highly conserved among the 
various isolates of the parasite. It was therefore decided to chemically syn¬ 
thesize peptides that corresponded to small portions of the C terminus of 
merozoite surface protein 3. Human antibodies from individuals who were 
resistant to the parasite were affinity purified based upon their interaction 
with one or more of these peptides. The antibodies that bound to the pep¬ 
tides were then tested in an antibody-dependent cellular-inhibition assay. 
Antibodies directed against peptides B, C, and D (Fig. 12.13) had a major 
inhibitory effect on parasite growth. Based on the ability of peptides B, C, 
and D to bind to and select protective antibodies, a peptide representing 
amino acid residues 181 to 276 of merozoite surface protein 3 was chemi¬ 
cally synthesized. This peptide is currently being tested in clinical trials as 
a novel malaria vaccine. While more research needs to be done, in the 
future, synthetic peptide vaccines could become highly specific, relatively 
inexpensive, safe, and effective alternatives to traditional vaccines. 


Genetic Immunization: DNA Vaccines 

Delivery 

A novel strategy that elicits an antibody response without the introduction 
of an antigen has been developed. In this case, the gene encoding an anti¬ 
genic protein is introduced into cells of a target animal, where the antigen 
is synthesized (Table 12.3). In the initial experiments, gold microprojectiles 



FIGURE 12.12 Infection of an individual with Plasmodium falciparum (a malaria- 
causing parasite) introduced by a mosquito. 


were coated with E. coli plasmid DNA carrying an antigen gene under the 
transcriptional control of an animal virus promoter. A biolistic system was 
used to deliver the microprojectiles into cells in the ears of mice (see chapter 
18 for a more detailed description of the biolistic system). Other workers 
introduced cloned cDNAs into mouse cells by injecting large amounts of 
the plasmid carrying the target DNA directly into the muscles of test ani¬ 
mals. However, effective "genetic immunization" by direct injection into 
muscles (100 µg per mouse) requires 3 to 4 orders of magnitude more DNA 
than the biolistic delivery system (10 to 100 ng per mouse). One distinctive 
feature of genetic immunization is that the costly and time-consuming pro¬ 
cedure of either purifying an antigen or creating a recombinant vaccine 
delivery vehicle is bypassed. Moreover, proteins produced by this proce¬ 
dure are more likely to be correctly posttranslationally modified than are 
proteins that are produced by different host organisms. 
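
The "3 to 4 orders of magnitude" figure follows directly from the two dose ranges. As a minimal worked check (the function and constant names are illustrative assumptions, not from the source), the ratio of 100 µg to 10 to 100 ng can be computed as:

```python
import math

NG_PER_UG = 1000  # 1 microgram = 1,000 nanograms


def orders_of_magnitude(injection_ug, biolistic_ng):
    """Return log10 of the ratio of injected DNA mass to biolistic DNA mass."""
    ratio = (injection_ug * NG_PER_UG) / biolistic_ng
    return math.log10(ratio)


# 100 µg by direct injection versus the two ends of the biolistic range.
low = orders_of_magnitude(100, 100)   # against 100 ng per mouse
high = orders_of_magnitude(100, 10)   # against 10 ng per mouse
print(round(low), round(high))
```

The two results bracket the range given in the text: a 1,000-fold difference at the upper biolistic dose and a 10,000-fold difference at the lower one.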

An advantage of genetic immunization, besides bypassing the need for 
purified protein antigens, is that it can trigger a response against only the 
FIGURE 12.13 Schematic representation of Plasmodium falciparum merozoite surface 
protein 3 and peptides corresponding to portions of the C terminus. The peptides, 
labeled A to F, are drawn to scale, with the numbers above the whole protein indi¬ 
cating the amino acid number (counting from the N terminus). The "final peptide" 
is currently being tested in clinical trials for efficacy as a malaria vaccine. 


protein encoded on the plasmid and not against the plasmid itself. In addi¬ 
tion, when plasmid DNA is introduced into a mammalian system, only 
those genes (or cDNAs) under the control of eukaryotic regulatory signals 
will be transcribed and translated. Antibiotic resistance genes for main¬ 
taining the plasmid in E. coli will not be transcribed or translated, and the 
same vector can be used to deliver different proteins to an individual at the 
same time, or the administration of the same gene can be repeated a 
number of times. 

The feasibility of genetic immunization has been examined in detail. In 
one series of experiments, mice were injected in the quadriceps of both legs 
with an E. coli plasmid carrying the cDNA for influenza A virus nucleoprotein 
under the transcriptional control of either a Rous sarcoma virus or a 
cytomegalovirus promoter. Although the expression of the nucleoprotein 
was too low to detect, nucleoprotein-specific antibodies were observed in 
the blood of the test mice 2 weeks after the initial injection. In comparison 
to control mice, the nucleoprotein-injected mice were significantly pro¬ 
tected against the lethal effects of influenza virus infection (Fig. 12.14). 
Moreover, the nucleoprotein-injected mice were also protected against a 
different strain of influenza virus. This cross-protection is in sharp contrast 
to traditional influenza virus vaccines, which are directed against surface 
antigens of the virus, so that each vaccine is specific to a single strain of 


TABLE 12.3 Advantages of genetic immunization over conventional vaccines 

Cultivation of dangerous agents is not required. 

Since genetic immunization does not utilize any viral or bacterial strains, there is 
no chance that an attenuated strain will revert to virulence. 

Since no organisms are used, attenuated organisms that may cause disease in 
young or immunocompromised animals are not a problem. 

Approach is independent of whether the microorganism is difficult to grow or 
attenuate. 

Production is inexpensive because protein does not need to be produced or 
purified. 

Storage is inexpensive because of the stability of DNA. 

One plasmid could encode several antigens/vaccines, or several plasmids could 
be mixed together and administered at the same time. 
FIGURE 12.14 Survival of DNA-immunized mice. Injected mice were immunized 
with DNA that contained the influenza A virus nucleoprotein gene under the con¬ 
trol of the Rous sarcoma virus promoter on an E. coli plasmid. The control mice 
were injected with plasmid DNA only. The x axis represents the number of days 
after the animals were challenged with the live influenza virus. 


influenza virus. In addition, traditional vaccines work only as long as the 
antigens on the surface of the virus do not change. Unfortunately, the genes 
for the surface antigens mutate at a high rate, which creates significant dif¬ 
ferences among strains. Although core components of the virus, such as the 
nucleoprotein, are relatively invariant, they can activate the immune 
system by a mechanism that is different from that of surface antigens. 

The fate of the introduced DNA is not known, and it could have the 
undesirable effect of integrating into the genome of the host cell, possibly 
disrupting an important gene. However, this risk is currently considered to 
be extremely low. It is more likely that the DNA will exist for a short period 
as a nonreplicating extrachromosomal element before it is degraded. To 
date, genetic immunization has been used primarily to induce immune 
responses in animals, and to a more limited extent in humans, against a 
number of pathogenic organisms, including influenza A virus, human 
immunodeficiency virus (HIV) type 1, bovine herpesvirus 1, rabies virus, 
Plasmodium species (which cause malaria), hepatitis B virus, hepatitis C 
virus, bovine rotavirus, bovine respiratory syncytial virus, pseudorabies 
virus, FMDV, Newcastle disease virus, Clostridium tetani (which causes 
tetanus), and Mycobacterium tuberculosis (which causes tuberculosis). 
Several human clinical trials using DNA vaccines are currently ongoing. 

One of the problems with the use of DNA vaccines in large animals and 
humans compared to mice is that the transfection efficiency of introduced 
plasmid DNA is often insufficient to generate a protective immune 
response. One approach to deliver foreign DNA to animal cells utilizes 
biodegradable microscopic (0.3- to 1.5-µm) polymeric particles with a cationic 
surface that binds the plasmid DNA (Fig. 12.15). Plasmid DNA is 
bound to the surfaces of these "microparticles" and is slowly released over 
a period of 2 to 3 weeks after inoculation of an animal. Using microparti¬ 
cles, it was possible to achieve the same biological effect as with naked 
DNA with about 250-fold less DNA, demonstrating the potential of this 
approach. In addition, the level of antibodies induced by the expression of 



FIGURE 12.15 Schematic representation 
of the binding of plasmid DNA to the 
cationic surface of a polymeric micro¬ 
particle. 


plasmid-encoded genes bound to microparticles was significantly enhanced 
by (i) addition of the vaccine adjuvant aluminum phosphate and (ii) the use 
of nanoparticles 0.05 µm in diameter that were coated with poly-L-lysine. 
In contrast to naked DNA, DNA bound to microparticles induced potent 
cytotoxic T-lymphocyte responses at a low dose. 

To date, most DNA vaccines have been delivered either by intramus¬ 
cular or intradermal injection. Although these vaccines can induce a potent 
immune response, they do not induce mucosal immunity. Mucosal immu¬ 
nity can prevent pathogens from entering the body, while systemic immu¬ 
nity deals with pathogens only once they are inside the body. This is an 
important consideration because mucosal surfaces, including the respira¬ 
tory, intestinal, and urogenital tracts, are the major sites of transmission of 
many infectious diseases. However, because of the protective barriers of 
the mucosal surfaces, traditional antigen-based vaccines are largely ineffec¬ 
tive unless they are administered with specific agents that penetrate or 
bind to the mucosa, i.e., mucosal adjuvants. 

Mucosal immunity induces a separate and distinct response from sys¬ 
temic immunity. The antibodies produced as part of the mucosal immune 
response restrict not only mucosal pathogens, but also microorganisms that 
initially colonize mucosal surfaces and then cause systemic disease. Many 
mucosal vaccines are live attenuated organisms that infect mucosal sur¬ 
faces and are effective at inducing mucosal responses. Of these, oral polio 
vaccines and both attenuated Salmonella enterica serovar Typhi Ty21a and 
Vibrio cholerae vaccines are licensed for use in humans. 

DNA vaccines that are designed for delivery to mucosal surfaces are 
similar in principle to those used for intramuscular or intradermal delivery. 
To increase plasmid uptake and decrease its subsequent degradation, var¬ 
ious methods of formulating DNA have been tried. For example, cationic 
(positively charged) liposomes have been used to deliver DNA (which has 
a negatively charged phosphate backbone) to the respiratory tract, and 
DNA entrapment in biodegradable microparticles has been used for the 
oral delivery of foreign DNA. Moreover, to improve the potency of DNA 
vaccines for humans, a number of strategies have been devised, including 
using plasmids that, in addition to encoding a target gene, also express a 
cytokine(s), such as interleukin-2 (IL-2), IL-10, or IL-12 (which can act as an 
intercellular mediator in the generation of an immune response). 

A range of systems, including liposomes, live vectors (bacteria and 
viruses), and a wide range of adjuvants that increase the immune response 
(bacterial toxins, carboxymethylcellulose, lipid derivatives, aluminum 
salts, and saponins), have been tried for delivery of DNA to different cell 
types. Of necessity, various optimization strategies are tested in mice before 
they are tried on larger animals and then on humans, with no guarantee 
that an approach that works well in mice will also be a successful strategy 
in humans. Nevertheless, given the many perceived advantages of genetic 
immunization over the use of conventional vaccines (Table 12.3), this has 
become a very active area of research. For example, electroporation has 
been used to increase the transfection of DNA encoding target antigens. 
With this approach, DNA is injected intramuscularly, and the skeletal 
muscle is immediately electrically stimulated with a pulse generator. 
Despite the fact that this procedure, which causes some patient discomfort, 
results in local tissue injury and inflammation, it is tolerated by patients 
without the need for any anesthesia, and there do not appear to be any 
long-term negative side effects to delivering DNA in this way. Most likely, 
the electrical pulse increases the transfection efficiency of the added DNA, 
and it is becoming a method of choice for clinically administering DNA 
vaccines. 

A modified strain of the invasive bacterium Shigella flexneri has been 
developed to facilitate the delivery of DNA into animal cells for genetic 
immunization (Fig. 12.16). Shigella can enter animal epithelial cells, 
escaping the phagocytic vacuole, and the bacterium can direct plasmid 
DNA to the nucleus of the host cell, where, if the introduced gene(s) con¬ 
tains a eukaryotic promoter, it is transcribed. Shigella is normally a patho¬ 
genic organism and would not be an acceptable DNA delivery system. 
Therefore, to use Shigella, it was first necessary to construct a nonpathogenic 
version of the wild-type organism by (1) engineering the bacterium 
to be toxin deficient and (2) making a deletion mutation in the Shigella asd 
gene, which encodes the enzyme aspartate β-semialdehyde dehydrogenase. 
This enzyme is normally involved in the synthesis of the bacterial 
cell wall constituent diaminopimelic acid; therefore, the mutant cannot 
grow unless diaminopimelic acid is added to the growth medium. Shigella 
strains with the asd mutation can invade animal epithelial cells and deliver 
their plasmid DNA; however, once present, the Shigella cells are unable to 
proliferate. 

Determination of the safety of using Shigella as a vector for the delivery 
of DNA to animal cells must await the results of human trials; however, the 
results of experiments with guinea pigs are promising. The greatest poten¬ 
tial advantage of this approach is that with the Shigella system, DNA for 
Since Jenner developed the first 
vaccine over 200 years ago, most 
human vaccines against viral dis¬ 
eases have included partially killed or 
attenuated preparations of the disease- 
causing, or a similar nonpathogenic, 
version of the virus. While this 
approach has been undeniably effec¬ 
tive and prevents the spread of a 
number of viral diseases, it is clearly 
limited. For example, not all viruses 
can be grown in culture, which pre¬ 
cludes the development of vaccines 
against these viruses; production of 
traditional vaccines is expensive and 
potentially dangerous; and not all 
viral diseases are preventable through 
the use of these traditional vaccines. 
With the advent of molecular biotech¬ 
nology, alternative strategies were 
examined for developing safer, less 


expensive, and more effective vaccines 
that would not have the limitations of 
using whole viruses, killed or attenu¬ 
ated, as vaccines. Since vaccines 
immunize individuals by priming 
their immune systems, it was thought 
that for some viruses short synthetic 
peptides might elicit the same anti¬ 
body response as the antigenic deter¬ 
minants normally found on the 
external surface of the virus. These 
peptides were designed with the iden¬ 
tical linear sequence of amino acids 
that made up the viral antigenic deter¬ 
minant in the first place. Of course, 
this approach could be expected to 
work only when the amino acids of an 
antigenic region (epitope) were contig¬ 
uous. 

Bittle et al. isolated and character¬ 
ized the viral RNA for FMDV and 


then determined the sequence of VP1, 
the major antigenic protein of the 
virus. Based on other experiments, 
they reasoned that the major antigenic 
determinants on VP1 would probably 
be found at either the N- or C-terminal 
end of the protein. They then chemi¬ 
cally synthesized a series of peptides 
based on the amino acid sequences of 
the N and C termini, chemically 
linked these peptides to carrier pro¬ 
teins, and then used them to inoculate 
rabbits and guinea pigs. With peptides 
from the C-terminal end of VP1 as 
antigens, the treated animals synthe¬ 
sized antibodies that protected them 
against disease from whole foot-and- 
mouth disease virus. This work estab¬ 
lished the principle that a protein 
domain(s) is sufficient to induce anti¬ 
bodies that can neutralize intact virus 
particles, and hence a new type of vac¬ 
cine, which after the initial stages of 
development does not utilize or 
depend upon the disease-causing 
virus, is possible. 
FIGURE 12.16 Use of nonpathogenic S. flexneri to deliver foreign DNA to mammalian 
epithelial cells. A strain of Shigella with a deletion mutation in the asd gene, which 
encodes the enzyme aspartate β-semialdehyde dehydrogenase, is unable to proliferate and 
can be used as a live vector. 


vaccination may be delivered orally, greatly simplifying the delivery of a 
variety of vaccines. 

Some limitations of plasmid-based vaccines are (1) the necessity for 
strong promoters that function in vivo to selectively transcribe the intro¬ 
duced DNA, (2) low levels of foreign-gene expression resulting from differ¬ 
ences in codon usage between the introduced gene (often of viral, bacterial, 
or parasitic origin) and the animal being inoculated, and (3) the presence of 
antibiotic resistance marker genes on the plasmid vector. To avoid 
the use of antibiotic resistance marker genes, researchers have developed a 
series of minimalistic immunogenically defined gene expression (MIDGE) 
vectors (Fig. 12.17). Following insertion of the gene of interest into a MIDGE 


FIGURE 12.17 Use of a MIDGE vector to produce a capped linear DNA sequence 
containing the gene of interest, a promoter, an intron (which facilitates expression 
of the gene of interest), and a polyadenylation signal. 
vector, the antibiotic resistance gene is excised from the vector, and an 
oligonucleotide cap is added specifically to both ends of the linearized DNA 
that contains only a promoter/intron, the gene of interest, and a polyadenylation 
signal. The remaining portion of the plasmid is degraded by exonucleases. 
The capped ends are resistant to exonuclease digestion. The purified, 
capped linear MIDGE vector is then used directly for transfection. These 
vectors have been used successfully as a substitute for plasmid vectors. 
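
The second limitation listed above, differences in codon usage between the introduced gene and the inoculated animal, can be made concrete by tallying the codons in a coding sequence. The sketch below is illustrative only: the function name is an assumption, and the toy sequence is invented for this example rather than taken from any real gene.

```python
from collections import Counter


def codon_counts(coding_seq):
    """Count the codons in an in-frame coding sequence."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return Counter(seq[i:i + 3] for i in range(0, len(seq), 3))


# Invented toy sequence (not a real gene): ATG, four alanine codons, stop.
toy_sequence = "ATGGCAGCAGCGGCTTAA"
counts = codon_counts(toy_sequence)

for codon, n in sorted(counts.items()):
    print(codon, n)
```

Comparing such a tally for a pathogen gene with the codons favored in the host's own highly expressed genes is one way to see which codons might limit expression; the host data needed for that comparison are not given in the text and are not supplied here.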

Vaccination of rhesus macaques (monkeys) with DNA encoding simian 
immunodeficiency virus proteins, followed by a booster with a modified 
vaccinia virus that encoded many of the same proteins, protected the mon¬ 
keys against infection by simian immunodeficiency virus (Fig. 12.18). The 
DNA that was injected at 0 and 8 weeks expressed the simian immunode¬ 
ficiency virus proteins Gag, Pol, Vif, Vpx, and Vpr, as well as the HIV type 
1 proteins Env, Tat, and Rev. The recombinant vaccinia virus expressed the 
simian immunodeficiency virus proteins Gag and Pol and the HIV protein 
Env, all under the control of vaccinia virus promoters, and was adminis¬ 
tered at 24 weeks. The protection against simian immunodeficiency virus 
at 7 months after the booster with the recombinant modified vaccinia virus 
was much greater than protection with either treatment by itself. This pro¬ 
cedure also confers immunity against a mucosal viral challenge. This fea¬ 
ture is important because the site of entry of the virus into the simian host 
is effectively blocked. It is also noteworthy that the immunity was main¬ 
tained for a long time. 

Dental Caries 

The gram-positive, facultatively anaerobic bacteria Streptococcus mutans 
and Streptococcus sobrinus are considered to be the primary causative agents 
of dental caries (tooth decay). These organisms colonize tooth surfaces and 
metabolize sucrose to produce lactic acid, which causes the tooth enamel to 
become vulnerable to decay. Sucrose is also used to produce a sticky, extra¬ 
cellular, dextran-based polysaccharide (glucan) that facilitates Streptococcus 
cells' adhering to one another and to tooth surfaces, forming plaque. It is 
the combination of plaque and acid that leads to tooth decay. 

Two regions of one of the adhesion proteins found on the surfaces of S. 
mutans and S. sobrinus cells are important in the initial adherence of these 
bacteria to tooth surfaces: one sequence is rich in alanine residues, while 
the other is rich in proline residues. Another important component of the 
mechanism of tooth decay is the enzyme glucosyltransferase, which is 
responsible for the synthesis of glucan, an insoluble extracellular polymer 


FIGURE 12.18 Vaccination regimen of rhesus monkeys with DNA containing simian 
immunodeficiency virus (SIV) genes and vaccinia virus carrying the same genes. 
of glucose moieties. A DNA vaccine designed to prevent dental caries 
included the coding sequences for an alanine- and proline-rich peptide, as 
well as the C-terminal domain of a Streptococcus glucosyltransferase. This 
C-terminal domain is necessary for the binding of the glucosyltransferase 
to the bacterial cell surface. This DNA vaccine therefore encoded two sepa¬ 
rate peptides, both of which facilitate the binding of Streptococcus cells to 
the tooth surface. In an attempt to overcome the tendency of many DNA 
vaccines to induce only a weak immune response, the DNA vaccine con¬ 
struct contained two additional elements. First, the extracellular domain of 
cytotoxic T-lymphocyte antigen 4 (CTLA4), which binds to the B7 protein 
that is expressed on the surfaces of antigen-presenting cells, was included 
in the construct (Fig. 12.19). Second, the Fc region of an immunoglobulin G 
molecule, which can bind to the Fc receptor on the antigen-presenting cell, 
was included in this construct. The use of both of these peptides was 
designed to specifically target the multidomain fusion protein (Fig. 12.19) 
to immune system cells and thereby amplify the immune response and 
enhance the efficacy of the vaccine. In fact, rabbits immunized with this 
vaccine, either intranasally or intramuscularly, displayed a significantly 
enhanced, specific systemic and mucosal immune response compared to 
immunization with only the alanine- and proline-rich peptide fused to the 
C-terminal domain of glucosyltransferase. Moreover, it was subsequently 
shown that this DNA vaccine could provide significant protection against 
dental caries in rats that were challenged with S. mutans and S. sobrinus. 
Although the problem of delivering the DNA vaccine to humans still needs 
to be addressed so that its clinical efficacy can be tested, this is a very prom¬ 
ising strategy that could be enormously beneficial to human populations. 


FIGURE 12.19 Schematic representation of a multidomain protein encoded by a DNA 
vaccine. Two domains, the CTLA4 extracellular domain and the immunoglobulin G 
(IgG) Fc region, are designed to target the other two portions of the molecule, i.e., 
the glucosyltransferase C-terminal domain and the alanine-proline-rich peptide, to 
antigen-presenting cells. 
Attenuated Vaccines 

In some instances, genetic manipulation may be used to construct modified 
organisms (bacteria or viruses) that are used as live recombinant vaccines. 
These vaccines are either nonpathogenic organisms that have been engi¬ 
neered to carry and express antigenic determinants from a target patho¬ 
genic agent or engineered strains of pathogenic organisms in which the 
virulence genes have been modified or deleted. In these instances, as part 
of a bacterium or a virus, the important antigenic determinants are pre¬ 
sented to the immune system with a conformation that is very similar to 
the form of the antigen in the disease-causing organism. Although suc¬ 
cessful in some cases, purified antigen alone often lacks the native confor¬ 
mation and elicits a weak immunological response. 

Cholera 

It is usually advantageous to develop a live vaccine, because live vaccines are 
generally much more effective than killed or subunit vaccines. The major 
requirement for a live vaccine is that no virulent forms be present in the 
inoculation material. With this objective in mind, a live cholera vaccine has 
been developed. Cholera, caused by the bacterium V. cholerae, is a fast¬ 
acting intestinal disease characterized by fever, dehydration, abdominal 
pain, and diarrhea. It is transmitted by drinking water contaminated with 
fecal matter. In developing countries, the threat of cholera is a real and 
significant health concern whenever water purification and sewage dis¬ 
posal systems are inadequate. 

Since V. cholerae colonizes the surface of the intestinal mucosa, it was 
reasoned that an effective cholera vaccine should be administered orally 
and directed to this structure. With this in mind, a strain of V. cholerae was 
created with part of the coding sequence for the A1 peptide deleted. This 
strain cannot produce active enterotoxin; therefore, it is nonpathogenic and 
is a good candidate for a live vaccine. 

Specifically, in this experiment, a tetracycline resistance gene was 
incorporated into the A1 peptide DNA sequence on the V. cholerae chromosome. 
This insertion inactivated the A1 peptide activity and also made the 
strain resistant to tetracycline. Although the A1 peptide sequence has been 
disrupted, the strain is not acceptable as a vaccine because the inserted 
tetracycline resistance gene can excise spontaneously, thereby restoring 
enterotoxin activity. Consequently, it was necessary to engineer a strain 
carrying a defective A1 peptide sequence that could not revert (Fig. 12.20). 

1. A plasmid containing the cloned DNA segment for the A1 peptide 
was digested with the restriction enzymes ClaI and XbaI, each of 
which cut only within the A1 peptide-coding sequence of the insert. 

2. To recircularize the plasmid, an XbaI linker was added to the ClaI 
site and then cut with XbaI. 

3. T4 DNA ligase was used to join the plasmid at the XbaI sites, 
thereby deleting a 550-base-pair segment from the middle of the A1 
peptide-coding region. This deletion removed 183 of the 194 amino 
acids of the A1 peptide. 

4. Then, by conjugation, the plasmid containing the deleted A1 
peptide-coding sequence was transferred into the V. cholerae strain 
carrying the tetracycline resistance gene within its A1 peptide DNA 
sequence. 
5. Recombination (a double crossover) between the remaining A1 
coding sequence on the plasmid and the tetracycline resistance 
gene-disrupted A1 peptide gene on the chromosome replaced the 
chromosomal A1 peptide-coding sequence with the homologous 
segment on the plasmid carrying the deletion. 

6. After growth for a number of generations, the extrachromosomal 
plasmid, which is unstable in V. cholerae, was spontaneously lost. 

7. Cells with an integrated defective A1 peptide were selected on the 
basis of their tetracycline sensitivity. The desired cells no longer 
had the tetracycline resistance gene but carried the A1 peptide 
sequence with the deletion. 
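
The numbers quoted in step 3 can be checked with simple codon arithmetic. This is a minimal sketch with a hypothetical function name; it considers only the 550-base-pair deletion and ignores any bases contributed by the linker at the junction.

```python
def codons_removed(deleted_bp):
    """Return (whole codons removed, leftover base pairs) for a deletion."""
    return divmod(deleted_bp, 3)


whole_codons, leftover_bp = codons_removed(550)

# Fraction of the 194-amino-acid peptide accounted for by the deletion.
fraction_removed = whole_codons / 194

print(whole_codons, leftover_bp, round(fraction_removed, 2))
```

A 550-base-pair segment corresponds to 183 whole codons with 1 base pair left over, which agrees with the stated loss of 183 of the 194 amino acids, or about 94% of the peptide.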

A stable strain with an A1 peptide sequence containing a deletion was 
selected in this way. This strain did not produce active enterotoxin but 
nevertheless retained all the other biochemical features of the pathogenic form 
of V. cholerae; that is, V. cholerae with an A1 peptide containing a deletion is a 
good vaccine candidate because the bacterium that synthesizes only the A2 
and B peptides is as immunogenic as the native bacterium. When this strain 
was evaluated in clinical trials to test its effectiveness as a cholera vaccine, 
the results were equivocal. While the vaccine conferred nearly 90% protec¬ 
tion against diarrheal disease in volunteers, it induced side effects in some 
of those who were tested. This strain may require modification at another 
chromosomal locus before it can be used as a vaccine. 

Salmonella Species 

Other attempts to engineer nonpathogenic strains of pathogenic bacteria 
that could be used as live vaccines have involved deletions in chromosomal 
regions that code for independent and essential functions. At least two 
deletions are preferred, because the probability that both sets of functions 
can be simultaneously reacquired is very small. It is assumed that a 
"doubly deleted" strain would have a limited ability to proliferate when it 
is used as a vaccine, thereby curtailing its pathogenicity while allowing it 
to stimulate an immunological response. 
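
The preference for at least two deletions rests on the multiplication rule for independent events. The sketch below illustrates that reasoning only: the text gives no reversion frequencies, so the per-locus probability used here is a hypothetical placeholder, and independence of the two loci is assumed.

```python
from fractions import Fraction

# HYPOTHETICAL per-locus probability of reacquiring a deleted function.
# The source gives no number; this value is for illustration only.
p_single = Fraction(1, 10**6)


def joint_reacquisition(p_a, p_b):
    """Probability that two independent reacquisition events both occur."""
    return p_a * p_b


p_double = joint_reacquisition(p_single, p_single)
print(p_single, p_double)
```

Whatever the true per-locus value is, the joint probability for two independent loci is its square, so a doubly deleted strain is far less likely than a singly deleted one to regain full function.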

Strains of the genus Salmonella cause enteric fever, infant death, typhoid 
fever, and food poisoning. Therefore, an effective vaccine against these 
organisms is needed. Deletions in a number of different genes have been 
used to attenuate various Salmonella strains (Table 12.4). These mutations 
can be grouped into three basic categories: mutations in (1) biosynthetic 
genes, (2) regulatory genes, and (3) genes involved in virulence. In addi¬ 
tion, strains with more than one deletion have been constructed. For 
example, one double-deletion strain has deletions in the aro genes, which 
encode enzymes involved in the biosynthesis of aromatic compounds, and 
in the pur genes, which encode enzymes involved in purine metabolism. 
These double-deletion strains, which can be grown on a complete and 
enriched medium that supplies the missing nutrients, generally establish 
only low-level infections, since their host cells contain only a very low level 
of the metabolites that they require for growth. Typically, their virulence is 
reduced by 100-fold or more. These attenuated Salmonella strains are effec¬ 
tive oral vaccines for mice, sheep, cattle, chickens, and humans. 

Deletion of the dam gene, which encodes DNA methylase, may be a 
highly effective approach to produce avirulent Salmonella strains. The dam 
gene is a master switch that regulates the expression of 20 to 40 different 
Salmonella regulatory proteins. Thus, when mice were immunized with 
FIGURE 12.20 Strategy for deleting part of the cholera toxin A1 peptide DNA
sequence from a strain of V. cholerae. Note that the tetracycline resistance gene is 
introduced into the gene for the A1 peptide as part of a transposon. This construct 
no longer makes A1 peptide but cannot be used as a vaccine because it is possible 
for the transposon to be excised, with the result that A1 synthesis and pathogenicity 
are restored. 


Dam-negative strains of Salmonella, they tolerated up to 10,000 times the 
normally lethal dose. Generally, pathogenic bacteria turn on many of their 
genes as briefly as possible to avoid detection and attack by the host's 
immune system. However, with Dam-negative strains, these genes are 
expressed for much longer periods, making it easier for the host immune 
system to detect and destroy the invading bacteria. Because many other 
TABLE 12.4 Deleted genes and their functions in the development of attenuated
strains of Salmonella spp.

Deleted gene          Gene function
--------------------  ----------------------------------------------------------
galE                  Synthesis of lipopolysaccharide; decrease toxicity from
                      galactose
aroA, aroC, or aroD   Synthesis of chorismate, an aromatic amino acid precursor
                      and a PABA precursor; PABA is involved in the synthesis
                      of iron chelators
purA or purE          Synthesis of purines
asd                   Peptidoglycan and lysine biosynthesis
phoP and phoQ         Regulation of acid phosphatases and genes necessary for
                      survival in the macrophage
cya                   Encodes adenylate cyclase, which is involved in cAMP
                      synthesis
crp                   Encodes cAMP receptor; regulates expression of proteins
                      involved in transport and breakdown of carbohydrates
                      and amino acids
cdt                   Involved in tissue colonization by the bacterium
dam                   Encodes DNA methylase; appears to be a master switch for
                      20-40 different virulence genes
htrA                  Encodes a stress-induced polypeptide; deletion results in
                      significantly reduced persistence in human tissues

cAMP, cyclic AMP; PABA, p-aminobenzoic acid.


gut-colonizing bacteria have dam genes, if this approach with Salmonella 
turns out to be as effective as is expected, it may be possible to utilize a 
similar protocol with a range of pathogenic bacteria. 

Leishmania Species 

Although the human immune system can respond to infections by protozoan
parasites of the genus Leishmania, it has been difficult to develop an
effective vaccine against these organisms. Attenuated strains of Leishmania 
are sometimes effective as vaccines; however, they often revert to virulence. 
Also, the attenuated parasite can persist for long periods in an infected but 
apparently asymptomatic individual. Such individuals can act as reservoirs 
for the parasite, which can be transferred to other people by an intermediate
host. To overcome these problems, an attenuated strain of Leishmania
that is unable to revert to virulence was created by targeted deletion of an 
essential metabolic gene, such as the one encoding dihydrofolate reductase- 
thymidylate synthase. In one of these attenuated strains, Leishmania major 
E10-5A3, the two dihydrofolate reductase-thymidylate synthase genes that 
are present in wild-type strains were replaced with the genes encoding 
resistance to the antibiotics G-418 and hygromycin. For growth in culture, 
it is necessary to add thymidine to the medium that is used to propagate 
the attenuated (but not the wild-type) strain. In addition, unlike the wild 
type, the attenuated strain is unable to replicate in macrophages in tissue 
culture unless thymidine is added to the growth medium (Fig. 12.21). 
Importantly, the attenuated strain survives for only a few days when 
inoculated into mice; in that time, it does not cause any disease. Moreover, 
this period is sufficient to induce substantial immunity against Leishmania
in BALB/c mice that are later challenged with the wild-type parasite (Fig. 12.22).
Since the attenuated parasite did not establish a persistent infection or 
FIGURE 12.21 Proliferation of wild-type and attenuated L. major in mouse
macrophages. At time zero, macrophages were infected with the same amount of
stationary-phase L. major. The wild-type parasite and the attenuated parasite in the
presence of thymidine were able to proliferate, while the attenuated strain did not
proliferate in the absence of thymidine in the medium. Adapted from Titus et al.,
Proc. Natl. Acad. Sci. USA 92:10267-10271, 1995.

cause disease, even in the most susceptible strains of mice tested, it is considered
to be a strong candidate vaccine. Following additional experiments
with animals, it should be possible to test whether this attenuated parasite 
is effective as a vaccine in humans. 

Herpes Simplex Virus 

As with other pathogenic organisms that have been developed as live vaccines,
portions of the HSV genome have been deleted. Initially, it was


FIGURE 12.22 Immunity to virulent L. major induced in BALB/c mice inoculated 
with attenuated L. major. At time zero, mice that were previously inoculated with 
attenuated L. major were challenged with virulent L. major, and the sizes of the 
parasite-induced lesions were measured at various times. Control mice were not 
vaccinated with attenuated L. major. Adapted from Titus et al., Proc. Natl. Acad. Sci. 
USA 92:10267-10271, 1995. 
thought that a strong immune response could be obtained only if the virus
was able to replicate. However, several vaccines based on nonreplicating 
viruses induce an immune response. Developing an avirulent HSV is 
important, because subunit vaccines so far have been unsuccessful in 
inducing immunity against the virus. To prepare a safe and efficacious live 
HSV vaccine, two deletions at different locations in the viral genome were 
generated independently and then combined to form a double-deletion 
virus. This strain is unable to proliferate in host cells, and the probability 
that both sets of functions can be simultaneously reacquired is very small. 
This replication-defective strain induces protective immunity that can 
reduce acute viral shedding and latent infection. 


Vector Vaccines 

Vaccines Directed against Viruses 

Vaccinia virus, in the form of a live vaccine, has led to the eradication of
smallpox globally. Vaccinia virus is a member of the poxvirus family. This
completely sequenced virus has a double-stranded DNA genome that contains
187 kilobase pairs (kb) and encodes approximately 200 different proteins.
Vaccinia virus DNA replicates within the cytoplasm of infected cells.
Cytoplasmic, rather than nuclear, replication and transcription are possible
because vaccinia virus DNA contains genes for DNA polymerase, RNA
polymerase, and the enzymes to cap, methylate, and polyadenylate messenger
RNA (mRNA). Thus, if a foreign gene is inserted into the vaccinia
virus genome under the control of a vaccinia virus promoter, it will be
expressed independently of host regulatory and enzymatic functions. The
virus can infect humans and many other vertebrates, as well as invertebrates.

In addition to having a broad host range, vaccinia virus is well characterized
at the molecular level, is stable for years after lyophilization
(freeze-drying), and is usually a benign virus. For these reasons, it is a strong
candidate as a vector vaccine. The function of a vector vaccine is to deliver
and express cloned genes encoding antigens that elicit neutralizing antibodies
against pathogenic agents. Unfortunately, the vaccinia virus genome
is very large and lacks unique restriction sites. Therefore, it is not possible
to insert additional DNA directly into the viral genome. Of necessity, the
genes for specific antigens must be introduced into the viral genome by in
vivo homologous recombination, as follows.

1. The DNA sequence coding for a specific antigen, such as HBcAg, 
is inserted into a plasmid vector immediately downstream of a 
cloned vaccinia virus promoter and in the middle of a nonessential 
vaccinia virus gene, such as the gene for the enzyme thymidine 
kinase (Fig. 12.23A). 

2. This plasmid is used to transfect thymidine kinase-negative animal
cells in culture, usually chicken embryo fibroblasts, that have previously
been infected with wild-type vaccinia virus, which produces
a functional thymidine kinase.

3. Recombination between DNA sequences that flank the promoter 
and the neutralizing antigen gene on the plasmid and the homologous
sequences on the viral genome results in the incorporation of
the cloned gene into the viral DNA (Fig. 12.23B). Although the 
recombination event is rare, the absence of thymidine kinase 
FIGURE 12.23 Method for the integration into vaccinia virus of a gene whose protein
product, generally a viral antigen, elicits an immunological response. (A) Plasmid 
carrying a cloned expressible antigen gene. (B) A double-crossover event results in 
the integration of the antigen gene into vaccinia virus DNA. 


activity in the host cells and the disruption of the thymidine kinase 
gene in the recombined virus render the host cells resistant to the 
otherwise toxic effects of bromodeoxyuridine. This selection 
scheme enriches for cell lines that carry a recombinant vaccinia 
virus. 

4. The definitive selection of cells with a recombinant vaccinia virus 
is made by DNA hybridization with a probe for the antigen gene. 

Since thymidine kinase-negative mutants of vaccinia virus arise spontaneously
at a relatively high frequency of about 1 virus particle in 10^3 to 10^4,
a selectable marker is often cotransferred with the target gene. This makes 
it much easier to distinguish a spontaneous thymidine kinase mutant from 
a mutant deliberately generated by homologous recombination. In other 
words, a virus with a spontaneous mutation would not carry the selectable 
marker, whereas a virus that underwent homologous recombination would. 
The neo gene, which encodes the enzyme neomycin phosphotransferase II
and confers resistance to the kanamycin analogue G-418, is often used as the 
selectable marker. This gene, unlike some other selectable markers, is quite 
stable once it is inserted into the vaccinia virus genome. 
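
The screening logic described above can be sketched in a few lines. This is a minimal illustration, not a laboratory protocol: the function names are hypothetical, the recombination frequency of 1e-4 is an assumed placeholder (the text says only that recombination is rare), and the spontaneous frequency of 1e-3 is taken from the upper end of the range quoted above.

```python
def classify_virus(tk_active, carries_neo):
    """Classify a virus from two observed phenotypes (hypothetical helper)."""
    if tk_active:
        return "nonrecombinant (thymidine kinase intact)"
    if carries_neo:
        return "recombinant (thymidine kinase disrupted, marker present)"
    return "spontaneous thymidine kinase-negative mutant (no marker)"

def recombinant_fraction(recombination_freq, spontaneous_tk_freq):
    """Approximate fraction of thymidine kinase-negative viruses that are
    true recombinants, treating the two events as mutually exclusive."""
    return recombination_freq / (recombination_freq + spontaneous_tk_freq)

# Assumed frequencies for illustration only.
fraction = recombinant_fraction(recombination_freq=1e-4, spontaneous_tk_freq=1e-3)
print(f"fraction of TK-negative viruses that are recombinants: {fraction:.3f}")
print(classify_virus(tk_active=False, carries_neo=True))
print(classify_virus(tk_active=False, carries_neo=False))
```

With numbers of this order, most thymidine kinase-negative viruses would be spontaneous mutants, which is why a cotransferred marker such as neo is useful for telling the two classes apart.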

To avoid disrupting any vaccinia virus genes or the necessity of 
screening for selectable markers, a novel system has been devised in which 
every recombinant virus that can form a plaque will contain and express 
the target gene. Wild-type vaccinia virus contains a gene, vp37, that is 
responsible for the formation of plaques when the virus is grown on an 
animal cell monolayer (Fig. 12.24A). Deleting the vp37 gene and replacing 
it with an E. coli marker gene (Fig. 12.24B) creates a vaccinia virus mutant
that does not form plaques after 2 to 3 days of growth in cell culture. Target
genes are introduced into the mutant vaccinia virus by homologous recombination
with a transfer vector that carries the vp37 gene, as well as the
target gene (Fig. 12.24C). If homologous recombination between the
non-plaque-forming mutant and the transfer vector occurs, the viruses that can
form plaques have acquired the vp37 gene. Also, the target gene is inserted 
into the vaccinia virus genome, and the selectable marker gene is lost. Since 
the vp37 gene has been deleted in the mutant vaccinia virus, it is impossible 
for this mutation to revert to the wild type. Therefore, every virus that 
forms a plaque carries the desired construct. This procedure is simple and 
straightforward, is applicable to the cloning and expression of any target 
gene, does not require any extra marker genes, and does not disrupt any 
vaccinia virus genes. 
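
The vp37 rescue scheme can be modeled as a simple segment swap between shared flanking sequences. The sketch below is a toy model under stated assumptions: genomes are represented as ordered lists of element names (an invention of this example), and `recombine` and `forms_plaque` are hypothetical helpers, not part of any real software.

```python
def recombine(genome, transfer_vector, left="left_flank", right="right_flank"):
    """Replace whatever lies between the flanks in the genome with whatever
    lies between the same flanks in the transfer vector (double crossover)."""
    g_left, g_right = genome.index(left), genome.index(right)
    v_left, v_right = transfer_vector.index(left), transfer_vector.index(right)
    insert = transfer_vector[v_left + 1 : v_right]
    return genome[: g_left + 1] + insert + genome[g_right:]

def forms_plaque(genome):
    """In this toy model, a virus forms plaques only if it carries vp37."""
    return "vp37" in genome

# Element order follows the description of Fig. 12.24B and C.
mutant = ["left_flank", "p7.5", "marker_gene", "right_flank"]
vector = ["left_flank", "vp37", "p7.5", "target_gene", "right_flank"]

recombinant = recombine(mutant, vector)
print(recombinant)
print("mutant forms plaque:", forms_plaque(mutant))
print("recombinant forms plaque:", forms_plaque(recombinant))
```

In this model, the only route to a plaque-forming virus is the recombination event that also brings in the target gene and removes the marker gene, which mirrors the selection argument in the text.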

A number of antigen genes have been successfully inserted into the
vaccinia virus genome and subsequently expressed in animal cells in culture.
These antigens include rabies virus G protein, hepatitis B surface
antigen, Sindbis virus surface proteins, influenza virus NP and HA proteins,
vesicular stomatitis virus N and G proteins, and HSV glycoproteins.
Several recombinant vaccinia virus vehicles have been shown to be effective
vaccines. For example, a recombinant vaccinia virus that expresses the
HSV-1 gD (glycoprotein D) gene prevents herpes infections in mice.
Another recombinant vaccinia virus that expresses the rabies virus surface
antigen gene was able to elicit neutralizing antibodies in foxes, which are
major carriers of rabies in Europe, and has been used in the field for some
time, including in an area of approximately 10,000 km^2 in Belgium. The
vaccinia-rabies virus glycoprotein recombinant virus vaccine that is presently
on the market (Raboral) is a live viral vaccine containing 10^8 plaque-forming
units (PFU), or live viral particles, per dose. It is constructed by
insertion of the DNA copy coding for glycoprotein G of a rabies virus strain
into the thymidine kinase gene of a strain of vaccinia virus. Once the vaccine
is ingested by a fox, the vaccinia virus begins to replicate and express
rabies glycoprotein G, which stimulates the development of immune
responses to the rabies glycoprotein. This results in the production of
neutralizing antibodies against the rabies virus in the immunized foxes. This
immunity typically lasts about 12 months in cubs and 18 months in adult
animals.

The use of vector vaccines constructed from vaccinia virus also offers 
the possibility of vaccinating individuals against several different diseases 
with one treatment. This may be achieved by using a recombinant vaccinia 
virus carrying cloned genes encoding a number of different antigens. 

The timing of the production of a foreign protein whose gene is carried 
in a vaccinia virus depends on whether a vaccinia virus promoter functions 




[Figure 12.24 diagram. (A) Wild-type vaccinia virus genome: left flank, vp37
gene, right flank. (B) Mutant vaccinia virus genome: left flank, p7.5, marker
gene, right flank. (C) Vaccinia virus transfer vector: left flank, vp37 gene,
p7.5, MCS, target gene, right flank.]

FIGURE 12.24 (A) Portion of a wild-type vaccinia virus genome that contains the vp37
gene that is responsible for plaque formation in host cells. (B) Portion of a mutant
vaccinia virus genome in which the vp37 gene has been replaced by a marker gene.
(C) Portion of a vaccinia virus transfer vector. "Left flank" and "right flank" refer to
the DNA sequences that immediately precede and follow the vp37 gene in the
wild-type vaccinia virus genome. The native vp37 promoter is part of the vp37 gene
sequence (not shown). MCS is a multiple cloning site with seven unique restriction
enzyme sites. p7.5 is a strong early/late vaccinia virus promoter. The target gene is
inserted into the multiple cloning site. Subsequently, homologous recombination
between the transfer vector (C) and the genomic DNA of the mutant virus (B)
results in the replacement of the E. coli marker gene with the vp37 gene, together
with a target gene.


during the early or late phase of the infection cycle, and the strength of the
promoter determines the amount of an antigen that is produced. For the
most part, late promoters for an 11-kilodalton (kDa) protein (p11) and the
cowpox virus A-type inclusion protein (pCAE) have been used to achieve
high levels of foreign-gene expression. When genes encoding several different
foreign proteins are inserted into one vaccinia virus, each is placed
under the control of a different vaccinia virus promoter to avoid the possibility
of homologous recombination between different portions of the
virus genome that might cause the cloned genes to be lost.

A live recombinant viral vaccine has several advantages over killed 
virus or subunit vaccines. First, the virus can express the authentic 
antigen(s) in a manner that closely resembles a natural infection. Second, 
the virus can replicate within the host, thereby amplifying the amount of 
antigen that activates the release of antibodies from B cells (humoral 
response) and stimulates the production of T cells (cell-mediated immune 
response). 

A disadvantage of using a live recombinant viral vaccine is that vaccination
of an immunosuppressed host, such as an individual with AIDS, can
lead to a serious viral infection. One way to avoid this problem may be to
insert the gene encoding human IL-2 into the viral vector. IL-2 enhances the
response of the T cells of the immune system, enabling the recipient to limit 
the proliferation of the viral vector, and thereby decreases the possibility of 
an unwanted infection. 

If the proliferation of vaccinia virus has deleterious effects in certain 
patients, it would be helpful to kill or inhibit it after vaccination. One 
approach is to create an interferon-sensitive vaccinia virus (wild-type vaccinia
virus is relatively resistant to interferon) whose proliferation is curtailed.
Such a virus vector would be susceptible to drug intervention if
complications from vaccination with vaccinia virus vectors arose. 

The basis of the resistance of vaccinia virus to interferon was not
known until a vaccinia virus open reading frame (K3L) was found to
encode a 10.5-kDa protein that has an amino acid sequence that is very
similar to that of a portion of the 36.1-kDa host cell eukaryotic initiation
factor 2α (eIF-2α). The N-terminal regions of both of these proteins contain
87 amino acids that are nearly identical. Moreover, this shared sequence
contains a serine residue, amino acid 51, which in eIF-2α is normally
phosphorylated by interferon-activated P1 kinase. When this serine residue in
eIF-2α is phosphorylated in interferon-treated cells, protein synthesis, and
therefore viral replication, is inhibited. Thus, vaccinia virus may avoid
inhibition by interferon because the K3L protein acts as a competitive
inhibitor of eIF-2α phosphorylation (Fig. 12.25). Therefore, deletion of all or
a portion of the K3L gene from vaccinia virus should make the virus sensitive
to interferon. A K3L-negative mutant of vaccinia virus was constructed
by PCR mutagenesis of the K3L gene carried on a plasmid, followed by
homologous recombination to replace the wild-type K3L sequence with the
modified version. When the wild-type and mutant versions of vaccinia
virus were tested for sensitivity to interferon, the mutant was 10 to 15 times
more sensitive to interferon than was the wild-type version (Fig. 12.26).
Reinsertion of the wild-type K3L sequence into the mutant virus restored
the level of interferon sensitivity found in the wild type. This indicates that
K3L is indeed involved in the interferon resistance phenotype of vaccinia
virus. This work is an important step in the development of safer vaccinia
virus vectors. Moreover, other interferon-resistant viruses may contain
sequences comparable to K3L and therefore may be amenable to the construction
of interferon-sensitive deletion mutants. Other, comparable
approaches to the creation of attenuated strains of vaccinia virus have been
developed. For example, a strain of vaccinia virus, constructed to have a
mutation in the B8R gene, which encodes an interferon viroreceptor, is less
pathogenic for mice than the parental viral strain.
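
The competitive inhibition proposed for K3L can be illustrated with the standard Michaelis-Menten rate law for a competitive inhibitor, v = Vmax[S] / (Km(1 + [I]/Ki) + [S]). The sketch below is a qualitative illustration only: treating K3L as a classical competitive inhibitor of the kinase is a simplifying assumption, and every concentration and kinetic constant is an arbitrary placeholder, not a measured value.

```python
def phosphorylation_rate(substrate, inhibitor, vmax=1.0, km=1.0, ki=1.0):
    """Michaelis-Menten rate with a competitive inhibitor:
    v = Vmax * [S] / (Km * (1 + [I] / Ki) + [S]).
    Here the substrate stands for eIF-2alpha and the inhibitor for K3L."""
    return vmax * substrate / (km * (1.0 + inhibitor / ki) + substrate)

# Arbitrary units; all values are placeholders for illustration.
without_k3l = phosphorylation_rate(substrate=1.0, inhibitor=0.0)
with_k3l = phosphorylation_rate(substrate=1.0, inhibitor=5.0)

print(f"eIF-2alpha phosphorylation rate without K3L: {without_k3l:.3f}")
print(f"eIF-2alpha phosphorylation rate with K3L:    {with_k3l:.3f}")
```

The more K3L is present relative to the initiation factor, the less of the factor is phosphorylated, so protein synthesis can continue in interferon-treated cells; removing K3L removes this protection.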

Currently, several veterinary vaccinia virus-based vaccines have been 
licensed, and clinical studies to test their efficacies in preventing a number 
of human infectious diseases are under way. This technology is based in 
part on the development of an attenuated version of the vaccinia virus 
strain that had previously been used in the eradication of smallpox. To 
avoid any risk of the vector itself becoming a source of disease, some 
genetic information was removed from the virus genome so that the viral 
vector was highly attenuated. This attenuated virus has been used to 
express a number of viral antigens with the expectation that the recombinant
virus would be an effective live vaccine. Protection has been achieved
by cloning glycoproteins from porcine pseudorabies virus, hemagglutinin
glycoproteins from equine influenza virus, a spike protein from the SARS
virus, and a polyprotein of Japanese swine encephalitis virus and then
vaccinating humans to prevent transmission of these viruses. Based on the
success of these attenuated vaccinia virus vaccines, it has been proposed
that this virus be considered a general delivery system for a wide range of
proteins.

FIGURE 12.25 Competitive inhibition of the interferon-stimulated phosphorylation
(inhibition) of eIF-2α by protein K3L, which is encoded by vaccinia virus and is
nearly identical to a portion of eIF-2α. (A) In the presence of interferon, a kinase is
activated that phosphorylates eIF-2α molecules and thereby prevents them from
functioning. (B) When vaccinia virus protein K3L is also present, it is
phosphorylated instead of the eIF-2α, so the eIF-2α remains active. The thickness
of the arrows represents the relative flux through each pathway.

For mass vaccination campaigns in developing countries, it would be
advantageous to be able to deliver live vaccines in a simple, expeditious,
and cost-effective manner. In addition, with mucosally transmitted pathogens,
such as HIV, traditional vaccination routes may not induce mucosal
immune responses sufficient to provide protective immunity. One possible
alternative to traditional vaccination is aerosol immunization, which is
potentially safer, easier, and less expensive to administer. To this end,
researchers tested the abilities of two attenuated vaccinia virus-based vectors
to be delivered effectively by aerosol immunization. In fact, it was
found that aerosol delivery was both safe and effective, yielding long-lasting
systemic and mucosal immune responses when delivered to rhesus
macaques (monkeys). This approach still needs to be tested with humans;
however, it could offer an effective means of inoculating large numbers of
individuals in the future.

Although much of the work on the development of live viral vaccines
has been done with vaccinia virus, other viruses, such as adenovirus, poliovirus,
and varicella-zoster virus, are also being tested as potential vaccine
vectors. Live attenuated poliovirus can be delivered orally, and such a
mucosal vaccine, which is directed to receptors in the lungs or gastrointestinal
tract, might also be useful against a range of diseases, including
cholera, typhoid fever, influenza, pneumonia, mononucleosis, and rabies.
However, the safety and efficacy of any apparently benign virus as a gene
delivery and expression system must be firmly established before clinical
trials are undertaken.
FIGURE 12.26 Sensitivity of wild-type (K3L+) and mutant (K3L-) vaccinia virus to
interferon. Mouse L929 cells were pretreated with a mixture of mouse alpha and
beta interferon for 24 hours before the virus was added. Plates without interferon
had approximately 3 x 10^7 viral plaques per plate. Adapted from Paoletti and
Tartaglia, U.S. patent 5,378,457, 1995.


Vaccines Directed against Bacteria 

Since the discovery and subsequent widespread dissemination of antibiotics,
only a modest amount of research has been directed toward the
development of vaccines for bacterial diseases. However, there are good 
reasons for developing bacterial vaccines: 

• Not all bacterial diseases are readily treated with antibiotics. 

• The use of antibiotics over the last 40 years has resulted in the proliferation
of bacterial strains that are resistant to several antibiotics.

• Reliable refrigeration facilities for the storage of antibiotics are not 
commonly available in many tropical countries. 

• It is often difficult to ensure that individuals receiving antibiotic 
therapy undergo the full course of treatment. 

Given the need to produce vaccines that will be effective against bacterial
diseases, the question is, which strategies are likely to be most effective?
In instances where the disease-causing bacterium does not grow well
in culture, the development of an attenuated strain is not feasible. For these
bacteria, alternative approaches must be used. For example, Rickettsia
rickettsii, a gram-negative obligately intracellular bacterium that causes Rocky
Mountain spotted fever, does not grow in culture. In this case, a cloned
155-kDa protein that is a major surface antigen of R. rickettsii was used as a
subunit vaccine and was found to protect immunized mice against infection
by this disease-causing bacterium.

Tuberculosis. Tuberculosis, one of the most important infectious diseases
worldwide, is caused by the bacterium M. tuberculosis. The bacterium can
form lesions in any tissue or organ, which leads to cell death. The lungs are
most commonly affected. Patients suffer fever and loss of body weight, and
without treatment, tuberculosis is often fatal. It is estimated that approximately
2 billion people are currently infected with the organism and that
approximately 2 million to 3 million deaths a year result from these infections.
Over the past 50 years, antibiotics have been used to treat patients
infected with M. tuberculosis. However, numerous multidrug-resistant
strains of M. tuberculosis are now prevalent. In the United States, among
HIV patients infected with an antibiotic-resistant strain of M. tuberculosis,
there is a 50% mortality rate within 60 days. Consequently, a bacterial disease
that was thought to be under control has become a serious public
health problem in many parts of the world.

Currently, in some countries, bacillus Calmette-Guérin (BCG), an
attenuated strain of Mycobacterium bovis that was developed between 1906
and 1919, is used as a vaccine against tuberculosis. However, the use of this
vaccine has some drawbacks. First, live BCG cells can cause tuberculosis in
immunocompromised individuals, such as AIDS patients. Second, individuals
treated with BCG respond positively to a common tuberculosis
diagnostic test, which makes it impossible to distinguish between individuals
infected with M. tuberculosis and those inoculated with BCG cells. For
these reasons, the BCG strain is not approved for use in a number of countries,
including the United States.

In an attempt to determine whether a safer and more effective vaccine
against tuberculosis might be developed, the extent of the immunoprotection
elicited by purified M. tuberculosis extracellular proteins was examined.
Following growth of the bacterium in liquid culture, 6 of the most
abundant of the approximately 100 secreted proteins (Fig. 12.27) were purified.
Each of these proteins was used separately and then in combination to
immunize guinea pigs. The immunized animals were then challenged with
an aerosol containing approximately 200 cells of live M. tuberculosis, a
large dose for these animals. The animals were observed for 9 to 10 weeks
before their lungs and spleens were examined for the presence of disease-causing
organisms. In these experiments, some of the purified protein
combinations provided a slightly lower level of protection against weight
loss, death, and infection of lungs and spleen than did the live BCG vaccine.
Prominent among the proteins that provided protection was the M.
tuberculosis major secretory protein, a 30-kDa mycolyltransferase also
known as α-antigen, or antigen 85B. However, a DNA vaccine encoding


FIGURE 12.27 Schematic representation of the development of a multiprotein subunit 
vaccine for tuberculosis. The six most abundant secreted proteins from M. tuberculosis
are purified from the growth medium and then tested for the ability to induce
antibodies in guinea pigs. The immunized animals are subsequently challenged 
with M. tuberculosis. 
FIGURE 12.28 The plasmid construct used to transform BCG to make it a more
effective vaccine. The plasmid is isolated from E. coli cells and then introduced
into BCG by electroporation. ori E, E. coli origin of replication; ori M,
Mycobacterium origin of replication; Hyg^r gene, hygromycin resistance gene
(and its promoter); P and α-antigen, the promoter and the coding region of the
30-kDa secreted protein.


this protein was even less effective than the purified secreted protein.
While this and possibly other M. tuberculosis-secreted proteins might eventually
be part of a safe and efficacious vaccine for the prevention of tuberculosis
in humans, it is necessary to develop a suitable delivery system for
them. In theory, the optimal delivery system for an antigen that provides
protection against tuberculosis should be (1) able to multiply in the mammalian
host, (2) nonpathogenic, and (3) able to express and secrete the
protective antigen. All of these requirements are satisfied by the available
BCG strain. Therefore, an E. coli-mycobacterium shuttle vector that contained
the gene for the 30-kDa protein (α-antigen) under the control of its
own promoter was introduced into two different BCG strains (Fig. 12.28).
Transformed cells produced 2.0- to 5.4-fold more 30-kDa protein than did
nontransformed cells. In addition, despite the fact that the introduced
genes were plasmid encoded and therefore potentially unstable, transformed
cells continued to express a high level of 30-kDa protein after the
vaccination of a test animal. In agreement with the hypothesis that the
extracellular proteins of intracellular organisms are key immunoprotective
molecules, guinea pigs immunized with transformed BCG strains had significantly
fewer bacilli in their lungs and spleens. In addition, there were
smaller and fewer lesions in their lungs, spleens, and livers, and the survival
of the animals was significantly increased, compared with animals
vaccinated with a nontransformed BCG strain (Fig. 12.29). This is the first
report of a vaccine against tuberculosis that is more potent than the currently
available commercial vaccine. This vaccine is currently in clinical
trials; if it is successful, it could save tens of thousands of lives. Moreover,
it is possible to prepare dried preparations of BCG in which individual
bacteria form rod-like structures, 1 to 4 μm long and 0.2 to 0.4 μm in diameter,
that may serve as the basis for a live bacterial vaccine that is delivered
as an aerosol, thereby facilitating the inoculation of newborn infants.

Bacteria as Antigen Delivery Systems 

Antigens that are located on the outer surface of a bacterial cell are more 
likely to be immunogenic than are those in the cytoplasm. Thus, one strategy 


FIGURE 12.29 Survival of guinea pigs infected (challenged) with a pathogenic strain 
of M. tuberculosis. BCG is the traditional live bacterial vaccine strain. 
is to place a neutralizing antigen from a pathogenic bacterium on the surface
of a live nonpathogenic bacterium. Flagella are made up of filaments of a
single protein called flagellin. Under a microscope, they appear as conspicuous
threadlike structures on the outer surfaces of some bacteria. If the flagella
of a nonpathogenic organism could be made to carry a specific epitope
from a pathogenic bacterium, protective immunogenicity might be easily 
achieved. 

This strategy was used to engineer a cholera vaccine (Fig. 12.30). A syn¬ 
thetic oligonucleotide specifying an epitope of the cholera toxin B subunit 
was inserted into a portion of the Salmonella flagellin gene that varies con¬ 
siderably from one strain to another (hypervariable segment). The construct 
was then introduced into a flagellin-negative strain of Salmonella. The 
epitope, which consisted of amino acid residues 50 to 64 of the cholera toxin 
B subunit, was shown previously to elicit antibodies directed against intact 
cholera toxin. The chimeric flagellin functioned normally. Furthermore, the 
epitope was present at the flagellum surface. Immunization of mice by 
intraperitoneal injections of approximately 5 × 10^6 live or formalin-killed
"flagellum-engineered" bacteria elicited high levels of antibodies directed 
against both the peptide, i.e., amino acids 50 to 64, and the intact cholera 
toxin molecule. Two or three different epitopes can be inserted into a single 
Salmonella flagellin gene, thereby creating a multivalent bacterial vaccine. 

FIGURE 12.30 Using Salmonella as an antigen delivery system and a flagellin-antigen
fusion protein for presenting the antigen to the host immune system. A
flagellin-negative strain of Salmonella was transformed with a plasmid containing a
synthetic oligonucleotide specifying an epitope of the cholera toxin B subunit
inserted into a hypervariable region of a Salmonella flagellin gene.

Attenuated Salmonella strains can be administered orally, which would
enable them to deliver a range of bacterial, viral, and parasite antigens to
the mucosal immune system. For this purpose, the choice of the promoter
that drives the transcription of the foreign antigen is important. If too
strong a promoter is used, the metabolic load might constrain bacterial
proliferation. Moreover, unlike in a closed system, such as a fermentation
vessel, it is not possible to shift the temperature or add specific
metabolites to induce foreign-gene expression once the bacterial vector has
been administered to a host animal. On the other hand, promoters that respond
to environmental signals may provide effective means of controlling the
expression of the foreign antigen gene. For example, the E. coli nirB promoter,
which is regulated by both nitrite and the oxygen tension of the environment,
becomes active under anaerobic conditions. In one series of experiments,
the nirB promoter was used to direct the expression of the nontoxic immunogenic
fragment C of C. tetani toxin (tetanus toxin) in an attenuated strain
of Salmonella. It is estimated that more than 1 million deaths per year in the 
developing world are the result of C. tetani infections. When the engineered 
Salmonella strain was grown aerobically in culture, tetanus toxin fragment 
C was not synthesized; however, following oral administration of the bac¬ 
terium to test mice, fragment C was produced, and the animals generated 
antibodies against the peptide. Thus, the engineered Salmonella strain has 
potential as a live oral tetanus vaccine. 

The spiral-shaped, gastrointestinal, and microaerophilic gram-negative 
bacterium Helicobacter pylori is widely distributed among human popula¬ 
tions. It is believed to be the causative agent for a number of gastrointes¬ 
tinal diseases, including chronic gastritis, peptic ulcers, gastric lymphoma, 
and gastric cancer. Among infected individuals, who account for more than
half of the world's population, about 10% are at risk of developing peptic 
ulcers. In recent years, the medical treatment for peptic ulcers has changed 
from antacids to antibiotics and proton pump inhibitors. The antibiotics 
eradicate the H. pylori infection, while the proton-pump inhibitors block the 
enzyme hydrogen-potassium ATPase, preventing the production of acid 
from the parietal cells at the gastric mucosa, which facilitates the healing of 
the mucosa. 

Unfortunately, H. pylori is resistant to a number of commonly used 
antibiotics, including metronidazole, amoxicillin, erythromycin, and 
clarithromycin. Treatment of H. pylori requires multidrug regimens because 
the organism resides in a layer of mucus that acts as a barrier to antibiotic 
penetration. In addition, the necessary course of antibiotic treatment is too 
expensive for populations of less developed countries. 

FIGURE 12.31 Schematic representation of an attenuated strain of S. enterica serovar
Typhi transformed with a plasmid encoding H. pylori urease subunits A and B
under the transcriptional control of a Salmonella promoter (P). The arrow indicates
the direction of transcription.

Colonization of the gastrointestinal tract by H. pylori is facilitated by
the action of an H. pylori-encoded urease. This enzyme hydrolyzes urea to
carbon dioxide and ammonia, thereby neutralizing stomach acid, making
it possible for the bacterium to survive, bind, and function in the host.
Urease is a cytosolic and surface-exposed nickel metalloenzyme and is one
of the most abundantly expressed proteins in H. pylori. The enzyme comprises
two subunits, A and B, that assemble into a complex [(αβ)3]4 supramolecular
structure. Subunit B is more antigenic, making it a possible
vaccine candidate. To develop a vaccine that protects individuals against H.
pylori infections, the genes encoding H. pylori urease subunits A and B were 
constitutively expressed under the control of a Salmonella promoter in a 
genetically deleted (attenuated) strain of S. enterica serovar Typhi (Fig. 
12.31). Neither immunization with urease-expressing S. enterica serovar 
Typhi alone nor immunization with the purified urease enzyme plus an 
adjuvant conferred protection against challenge with a mouse-adapted 
strain of H. pylori. On the other hand, a vaccination protocol that combined 
both urease-expressing S. enterica serovar Typhi and urease plus an adju¬ 
vant was protective. While the success of this approach remains to be estab¬ 
lished in humans, these initial results are nevertheless encouraging and 
give hope that a human vaccine against H. pylori will be developed in the 
near future. 


SUMMARY 


Traditionally, vaccines have been either inactivated or
attenuated infectious agents (bacteria or viruses) that are 
injected into an antibody-producing organism to produce 
immunity. There are a number of drawbacks to these vaccines. 
For example, not all pathogenic organisms can be grown to 
the large volumes needed to make a vaccine, there are safety 
concerns when large volumes of pathogenic organisms are 
being handled, attenuated strains may revert to the infectious 
state, inactivation may be incomplete, and shelf life is often 
dependent on refrigeration. 

Recombinant DNA technology has been used in various 
ways to create reliable vaccines. Immunologically active, noninfectious
agents are produced by deleting the genes that
cause virulence; with this deletion, a live vaccine would never
be able to revert to the infectious form. A gene(s) that encodes
the major antigenic determinant(s) from a pathogenic organism
can be cloned into the genome of a benign carrier organism 
(usually a virus or bacterium), which can be used as a vaccine 
without concern that any pathogenic organisms are present. 
The genes or segments of genes that encode the major anti¬ 
genic determinants of pathogenic organisms can be cloned 
into expression vectors, and large amounts of the product can 
be harvested, purified, and used as a vaccine. With the last 
strategy, complete genes produce subunit vaccines, and cloned 
domains of the major antigenic determinants produce peptide 
vaccines. Peptide vaccines may also be produced by chemical 
peptide synthesis. As an alternative to using cloned antigenic 
proteins or peptides for inoculation, DNA constructs encoding 
the antigenic protein or peptide may be utilized. These DNA 
constructs may be delivered directly to animals or humans.

REVIEW QUESTIONS


1. Briefly describe a protocol for developing a vaccine against 
a toxin-producing bacterium. 

2. What factors limit the usefulness of conventional vaccines? 

3. As part of your work with an international animal health 
organization, you are given the task of developing a vaccine 
against a bovine virus that is the cause of tens of thousands of 
cattle deaths around the world annually. The viral genome 
consists of a 10-kb linear piece of single-stranded RNA with a
poly(A) tail that encodes eight different proteins. The virus
does not have a viral envelope, and the major antigenic deter¬ 
minant is the capsid protein viral protein 2. Outline an experi¬ 
mental strategy to develop a vaccine against this virus. 

4. Discuss the development of peptide vaccines that are 
directed against viruses. 

5. What is vaccinia virus, and how can it be used to produce 
unique live recombinant vaccines?

6. As an employee of the World Health Organization, you
have to decide on the best strategy for eradicating rabies in 
wild animal populations. Assuming that you must choose 
between a peptide and a vaccinia virus-based vaccine, select 
one type of vaccine and justify your choice. 

7. Discuss the advantages of a live recombinant viral vaccine 
over killed and subunit vaccines. 

8. Discuss some of the different strategies that have been used 
to produce vaccines against cholera. 

9. How would you develop a subunit vaccine against HSV? 

10. How can bacteria be used as part of a DNA vaccine 
delivery system? 

11. How can vaccinia virus be made more sensitive to inter¬ 
feron? Explain. 


12. How would you develop a vaccine against S. aureus? 

13. Suggest several methods that you could use to deliver 
DNA for genetic immunization to animal cells. 

14. What are MIDGE vectors, and how can they be used to 
facilitate genetic immunization? 

15. How would you develop a vaccine against human papil¬ 
lomavirus? 

16. How would you develop an effective DNA vaccine 
against dental caries? 

17. How would you improve the traditional vaccine against 
tuberculosis?


Synthesis of Commercial
Products by Recombinant 
Microorganisms 


To date, molecular biotechnology research has focused largely on
the production of a range of different proteins, including enzymes
that are used commercially. However, recombinant DNA techniques
can also be used to enhance the production of low-molecular-weight com¬ 
pounds, such as vitamins, amino acids, dyes, precursors of biopolymers, 
and antibiotics. In these cases, the host microorganism is engineered to 
become a factory for the production of useful metabolites. 


Restriction Endonucleases 

Recombinant DNA technology would not be possible without a ready 
supply of different restriction endonucleases. Currently, more than 300 dif¬ 
ferent restriction endonucleases are commercially available, with world¬ 
wide sales in 2007 in the range of $350 million. These enzymes occur 
naturally in many different microorganisms, including species that are 
aerobic, anaerobic, photosynthetic, diazotrophic, mesophilic, thermophilic, 
psychrophilic, and either slow or fast growing. For each of these organisms, 
a detailed fermentation protocol—specifying the temperature, pH, medium 
composition, and oxygen tension—has to be developed and optimized to 
achieve the maximum yield of the target restriction enzyme. To avoid 
having to maintain a large number of different microorganisms, stock a 
very wide range of microbial growth medium components, design several 
different types of fermenters, and spend an inordinate amount of time 
developing optimal growth conditions for a large number of different 
organisms (one major supplier of restriction enzymes lists 265 different 
enzymes in its catalog), investigators often clone restriction endonuclease 
genes into Escherichia coli. Exclusive use of E. coli allows bioengineers to 
standardize the production conditions for all restriction endonucleases. In 
addition, E. coli cells grow rapidly to high cell densities and can be
engineered to significantly overexpress each target restriction enzyme.

Although the technology for isolating and expressing foreign genes in E. 
coli and some other host organisms is well established, it should be remembered
that the host organism is a living entity that can be dramatically
affected by the production or presence of a heterologous protein. For
example, overexpression of a heterologous protein may drain the host 
organism of important metabolic resources and, as a result, adversely affect 
its growth. In addition, the presence of a heterologous protein may be lethal 
to the host. For example, restriction endonucleases digest DNA at sites that 
are present on all DNA molecules. As a result, an organism that expresses a 
cloned restriction endonuclease gene is likely to have its own DNA degraded 
unless a protection mechanism is present. 

Microorganisms that make restriction endonucleases have evolved a 
self-protection system. Methylation of one or more of the bases of the DNA 
within the recognition sequence prevents the homologous restriction endo¬ 
nuclease from cutting the DNA at this site (Fig. 13.1A). Gram-negative 
microorganisms have an added mode of protection that entails localizing 
the restriction endonuclease within the periplasmic space and the methylation
(modification) enzyme in the cell cytoplasm (Fig. 13.1B). This
compartmentalization physically separates the restriction endonuclease from the
DNA while ensuring that the modification enzyme has ready access to the 
chromosomal DNA. In addition to protecting the cell from the toxic effects 
of the restriction enzyme, this segregation provides a cellular defense 
against attack by any foreign DNA, such as DNA from a bacterial virus, 
that might enter the periplasm. 

One way to circumvent the problem of host DNA degradation by het¬ 
erologous restriction endonucleases is to clone and express the genes for 
both the restriction enzyme and its specific (cognate) modification enzyme 
in the host organism. Cloning both of these genes into the same organism 
is technically complex unless both the restriction endonuclease and methy¬ 
lation genes are close to each other on the chromosome. In addition, to 
prevent the digestion of the host DNA by the restriction endonuclease, it is 
imperative that, after transformation, the methylation activity be expressed 
prior to the production of the restriction endonuclease. 

One of the first restriction enzyme genes to be cloned (into E. coli)
encoded the enzyme PstI from Providencia stuartii, a gram-negative bacterium
(Fig. 13.2). It is important to note that, for a particular genus and species,
only some strains encode restriction enzymes. Therefore, in this
strategy, in order to easily transform the host E. coli strain without
degrading the input plasmid DNA, it was necessary to utilize an E. coli
strain that was unable to synthesize the enzyme EcoRI.

FIGURE 13.1 (A) Protection of DNA from digestion by a restriction endonuclease by 
prior treatment with methylase, a methylating (modification) enzyme. The asterisks 
indicate the presence of a methylated base. (B) Cytoplasmic localization of modifi¬ 
cation enzyme (M) and periplasmic localization of restriction endonuclease (R) in 
gram-negative bacteria. 




FIGURE 13.2 Method for cloning and selecting the gene for the restriction enzyme
PstI. The P. stuartii chromosomal DNA is digested with HindIII and ligated into the
HindIII site of plasmid pBR322. Transformants are grown in liquid medium before
being infected with bacteriophage λ. The resistance of some transformants to lysis
by λ is due to the presence and expression of a cloned PstI gene and its cognate
methylation enzyme.


1. The chromosomal DNA from P. stuartii was digested with HindIII
and ligated into the HindIII site on plasmid pBR322.

2. Following the introduction of the P. stuartii clone bank (library)
into E. coli HB101 cells, transformants were grown in liquid
medium before being infected with bacteriophage λ to test for the
production of the restriction enzyme. When a restriction enzyme
gene is expressed, the host cells become resistant to the lytic action
of DNA bacteriophages such as λ because the restriction endonuclease
extensively degrades the infecting bacteriophage DNA.

3. Transformants that were resistant to lysis by λ were grown, and
samples were osmotically shocked to release the periplasmic proteins,
which were assayed for PstI restriction enzyme activity.

4. Positive clones were assayed for PstI methylase activity.

One positive clone from this experiment contained, within a 4.0-kilobase-pair
(kb) DNA fragment, an intact PstI restriction endonuclease and
methylation operon, including the P. stuartii promoter. In this construct, the
natural temporal order of synthesis (the methylation enzyme preceding
the restriction endonuclease) was maintained. The level of the PstI restriction
enzyme expressed in E. coli was approximately 10-fold higher than
that in P. stuartii. As expected, PstI was localized in the periplasm, and the 
methylation enzyme was localized in the cytoplasm. Production of PstI 
using this E. coli clone is simpler and more efficient than production with 
P. stuartii. 

Another strategy also has been used to isolate the genes for restriction 
and modification (methylation) enzyme systems. It was developed by a 
company that eventually became one of the world's leading suppliers of 
restriction enzymes and consists of the following steps. 

1. A clone bank was made from the DNA of a donor organism that 
had a previously identified restriction endonuclease. The plasmid 
vector had at least one recognition site for the target restriction 
endonuclease. 

2. The clone bank was introduced into E. coli by transformation. This 
step increased the amount of recombinant plasmid DNA and also 
allowed the expression of the modification enzyme. 

3. Plasmid DNA was isolated from transformed cells that had been 
grown in liquid media under conditions that selected for the pres¬ 
ence of the plasmid. 

4. The plasmid DNA preparation was treated with the target restric¬ 
tion endonuclease. 

5. E. coli cells were transformed with the restriction endonuclease- 
treated plasmid DNA preparation. 

The rationale for this procedure is that the clones that carry and express 
the target modification enzyme will produce plasmid DNA that is resistant 
to digestion by the target restriction endonuclease because their DNA will 
be methylated at the recognition sites. For example, following transformation
of E. coli by a pBR322-HindIII clone bank of Desulfovibrio desulfuricans
DNA, plasmid DNA was isolated and digested with the restriction enzyme
DdeI (Fig. 13.3). Plasmids that encode and express the DdeI modification
enzyme are not digested by DdeI because the eight DdeI recognition sites
of pBR322 are methylated. After the DdeI treatment, the remaining plasmid
mixture is used to transform E. coli. Only intact circular plasmids yield
transformants, and these carry the gene for a functional DdeI modification
enzyme. All other plasmids are degraded by the restriction endonuclease.
The resulting transformants must then be assayed for the DdeI restriction
enzyme activity to determine which clones have the genes for both the
modification enzyme and the restriction endonuclease. This strategy is
effective for any restriction enzyme gene that is physically close to its
modification enzyme gene (most restriction enzymes are encoded on the
same operon as their cognate modification enzyme) and is cloned into a
plasmid vector that has at least one recognition site for the target enzyme.

FIGURE 13.3 Scheme for cloning the DdeI modification enzyme. Plasmids with
cloned D. desulfuricans DNA are used to transform E. coli cells. After the cells are
grown in liquid culture, the plasmid DNA is isolated and digested with the restriction
enzyme DdeI. Plasmids encoding the DdeI modification enzyme will be
methylated at the DdeI restriction enzyme recognition site and will not be cut. The
intact plasmid DNA is used to transform E. coli cells. The final transformants are
assayed for the DdeI restriction enzyme and the corresponding methylase.
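
The selection logic behind this scheme can be sketched in a few lines of Python. This is only an illustrative model, not the published protocol: the toy sequences and clone names are hypothetical, and the one fact brought in from outside the text is the DdeI recognition sequence, CTNAG (N = any base). A plasmid is treated as surviving digestion if every DdeI site on it is methylated or if it carries no site at all.

```python
import re

# DdeI recognition sequence CTNAG; the lookahead lets overlapping sites be found.
DDEI_SITE = re.compile(r"(?=(CT[ACGT]AG))")


def ddei_sites(seq: str) -> list[int]:
    """Return the 0-based start position of every DdeI site in seq."""
    return [m.start() for m in DDEI_SITE.finditer(seq.upper())]


def survives_digestion(seq: str, expresses_methylase: bool) -> bool:
    """A plasmid stays intact if its sites are methylated or if it has no sites."""
    return expresses_methylase or not ddei_sites(seq)


# Hypothetical toy library: (clone name, plasmid sequence, expresses the methylase?)
library = [
    ("clone_1", "GGATCCCTGAGTTAACCTCAGAA", False),  # sites present, unprotected
    ("clone_2", "GGATCCCTGAGTTAACCTCAGAA", True),   # sites present, methylated
    ("clone_3", "GGATCCAAGCTTGGGCCC", False),       # no DdeI site at all
]

survivors = [name for name, seq, meth in library if survives_digestion(seq, meth)]
print(survivors)
```

In this toy library, clone_2 survives because it expresses the methylase, whereas clone_3 survives only because its sequence happens to lack a DdeI site. The second case illustrates why the vector must carry at least one recognition site for the selection to be meaningful.
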


Lipase 

Fatty stains are a persistent problem for the laundry industry. A combina¬ 
tion of high temperature and high alkalinity can effectively emulsify and 
remove many fatty stains. However, these conditions often damage fabrics
and also require large amounts of energy. On the other hand, the addition
of lipases that are compatible with the wash conditions, such as the enzyme
produced by Pseudomonas alcaligenes, may provide an effective solution to
this problem. Unfortunately, this enzyme is produced at such low levels 
that it is prohibitively expensive to use in the cleaning of laundry. Moreover, 
researchers have found it extremely difficult to overproduce the enzyme in 
a variety of heterologous hosts, including Bacillus licheniformis, E. coli, 
Streptomyces lividans, Aspergillus niger, and Kluyveromyces lactis. The diffi¬ 
culty in overexpressing the P. alcaligenes lipase may reflect the requirement 
for the simultaneous expression of another gene product that is involved in 
either the secretion or the stabilization of the bacterial lipase. 

To isolate the lipase gene from P. alcaligenes, as well as any other gene 
whose expression was linked to lipase gene expression, the enzyme was 
first purified. The amino acid sequence of the N terminus was determined, 
and an oligonucleotide probe that corresponded to 11 amino acids from 
this region of the protein was synthesized. The oligonucleotide probe was 
used to screen a clone bank of P. alcaligenes DNA, with the result that a 
clone that contained all of the lipase gene and a portion of the additional 
gene was obtained. The DNA fragment encoding the lipase gene and a por¬ 
tion of the additional gene was used as a hybridization probe to screen 
another clone bank, with the result that the rest of the second gene was 
isolated. The two fragments were spliced together (Fig. 13.4), cloned into a 
broad-host-range expression vector, and used to transform P. alcaligenes. 
The lipase structural gene is called lipA, and the second (helper) gene is
called lipB. When the vector was derived from a low-copy-number plasmid, 
the lipase activities of the transformants were four- to fivefold greater than 
that of the wild type, regardless of the presence or absence of the second 
gene (lipB). However, with a high-copy-number plasmid, the lipase activi¬ 
ties of the transformants were about 20-fold greater than that of the wild 
type in the absence of the lipB gene and approximately 35-fold greater than 
that of the wild type in the presence of the lipB gene. Since the lipase is 
secreted into the growth medium, very little purification should be neces¬ 
sary before using it in laundry detergent. Rather, the cells must be removed, 
and then the growth medium should be concentrated. However, when the
growth of this recombinant organism was scaled up from
10 liters to 10,000 liters, considerably less lipase activity was found than 
was expected. This was attributed to the production, and lack of removal, 
of large amounts of carbon dioxide in the growth medium. Modification of 
some of the operating parameters of this large fermenter resulted in a sig¬ 
nificant decrease in the level of dissolved carbon dioxide and decreased 
inhibition of lipase accumulation. This work went a long way toward pro¬ 
viding lipase of sufficient quality and in sufficient quantity for use in 
laundry detergent. 


FIGURE 13.4 Splicing together portions of the lipase operon. P, PvuII; B, BclI; V,
EcoRV. Partial genes are indicated by asterisks.


Small Biological Molecules

With recombinant DNA technology, it is possible to modify metabolic pathways
of organisms either by introducing new genes or by altering existing
ones. The goal is to create an organism with a novel enzymatic activity that
can convert an existing substrate into a commercial compound that with
current technology can be produced only by a combination of chemical
treatments and fermentation steps. Early metabolic-engineering experiments
typically modified one or two genes in a biosynthetic pathway.
However, with the knowledge of the complete DNA sequences of many
bacterial genomes and the global information on the expression of bacterial
genes derived from studies that employ microarrays, proteomics, and
metabolomics, it is possible to develop strategies to improve the yield of
microbially produced molecules that involve the introduction or modification
of an entire panel of related genes.

Synthesis of L-Ascorbic Acid 

L-Ascorbic acid (vitamin C) is currently synthesized commercially by an 
expensive process starting with D-glucose that includes one microbial fer¬ 
mentation step and a number of chemical steps (Fig. 13.5). The last step in 
this process is the acid-catalyzed conversion of 2-keto-L-gulonic acid 
(2-KLG) to L-ascorbic acid. Biochemical studies of the metabolic pathways 
of a number of different microorganisms have shown that it may be pos¬ 
sible to synthesize 2-KLG by a different pathway. For example, some bac¬ 
teria (Acetobacter, Gluconobacter, and Erwinia) can convert glucose to 
2,5-diketo-D-gluconic acid (2,5-DKG), and others (Corynebacterium, 
Brevibacterium, and Arthrobacter) have the enzyme 2,5-DKG reductase, 
which converts 2,5-DKG to 2-KLG. 

FIGURE 13.5 Commercial synthesis of L-ascorbic acid. Except for the microbial
conversion of D-sorbitol to L-sorbose, all the steps are chemical reactions. The
microbial conversion is carried out by Acetobacter suboxydans, which produces the
enzyme sorbitol dehydrogenase.

The current procedure for synthesizing ascorbic acid could be improved
by producing 2-KLG from glucose by cofermentation with suitable organisms.
Unfortunately, cocultivation has problems of its own. For example,
the two fermenting organisms might have different temperature and pH
optima. The medium requirements and growth rates also might differ in
such a way that the fermentation conditions are optimal for one organism
and suboptimal for the other. This situation leads to the eventual "washout"
(depletion or loss) of one of the organisms. Some of these incompatibilities
may be overcome by utilizing a tandem fermentation process in which the
two organisms are cultivated in succession (Fig. 13.6). Of course, this
approach requires two separate fermentations rather than one, and if the
organisms have different growth requirements, it is difficult to run the
process on a continuous basis. Therefore, the best way to convert glucose into
2-KLG would be to engineer a single microorganism that carried all of the
required enzymes. The conversion of D-glucose to 2,5-DKG by Erwinia
herbicola includes several enzymatic steps, whereas the transformation of
2,5-DKG to 2-KLG by a Corynebacterium sp. requires only one. Consequently,
the simplest strategy for constructing a single organism that is able to
convert D-glucose to 2-KLG is to isolate the 2,5-DKG reductase gene from the
Corynebacterium sp. and express it in E. herbicola.
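
The washout problem noted for cocultivation can be illustrated with a deliberately simple calculation. The sketch below assumes a continuous culture in which each organism grows exponentially at a fixed specific growth rate (mu) while the vessel is diluted at a constant rate (D), so that biomass follows dX/dt = (mu - D)X. Substrate limitation and interactions between the organisms are ignored, and all numerical values are hypothetical.

```python
import math


def biomass(x0: float, mu: float, dilution: float, hours: float) -> float:
    """Relative biomass after `hours`, from dX/dt = (mu - dilution) * X."""
    return x0 * math.exp((mu - dilution) * hours)


D = 0.30  # dilution rate per hour (hypothetical)

# Organism growing under near-optimal conditions (mu > D) persists.
well_suited = biomass(1.0, mu=0.35, dilution=D, hours=48)

# Organism growing under suboptimal conditions (mu < D) is washed out.
poorly_suited = biomass(1.0, mu=0.20, dilution=D, hours=48)

print(f"well suited:   {well_suited:.2f}")
print(f"poorly suited: {poorly_suited:.4f}")
```

Under these assumptions, the organism whose growth rate falls below the dilution rate declines toward zero while the other persists, which is the situation a single engineered organism avoids.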

The first step in cloning the 2,5-DKG reductase gene from the 
Corynebacterium sp. involved purifying the enzyme and determining the 
sequence of the first 40 amino acids from the N-terminal end of the mole¬ 
cule. On the basis of the known amino acid sequence, two 43-nucleotide 
DNA hybridization probes, each corresponding to a different portion of the 
protein molecule, were synthesized. Because 71% of the nucleotides in the 
Corynebacterium sp. are either G or C, the probes were designed to include, 
where possible, a G or C in the third position of all codons, thereby mini¬ 
mizing the extent of the mismatch between the probe and the target DNA. 
This approach was taken because at the time that this work was done, 
mixed probes were not readily available. 
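
The probe design rule described above (favoring G or C in the third codon position for a GC-rich genome) can be sketched as a simple back-translation. The peptide below is hypothetical, not the actual N-terminal sequence of 2,5-DKG reductase, and the choice of one particular G- or C-ending codon per amino acid is an assumption made for illustration.

```python
# One codon per amino acid from the standard genetic code, each ending in G or C.
# Which G/C-ending codon is chosen for each residue is an arbitrary assumption.
GC_ENDING_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTC",
}


def back_translate(peptide: str) -> str:
    """Return a single DNA probe sequence that encodes the peptide."""
    return "".join(GC_ENDING_CODON[aa] for aa in peptide.upper())


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


peptide = "MKDLAEGW"  # hypothetical peptide, for illustration only
probe = back_translate(peptide)
print(probe, round(gc_fraction(probe), 2))
```

Because every amino acid has at least one codon ending in G or C, a single unambiguous probe can always be written this way; whether it matches the target gene depends on how closely the organism's actual codon usage follows the assumption.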

FIGURE 13.6 Biological production of 2-KLG. Erwinia has a set of three enzymes that
synthesize 2,5-DKG from D-glucose. Corynebacterium has an enzyme that converts
2,5-DKG to 2-KLG. Thus, 2-KLG, which is the immediate precursor of L-ascorbic
acid, can be produced from D-glucose either by cocultivating these two
microorganisms or by genetically engineering Erwinia to express the enzyme from
Corynebacterium, which converts 2,5-DKG to 2-KLG.

A Corynebacterium DNA clone bank was screened with these two
probes. Any clones that hybridized with only one of the probes were
discarded. It was assumed that any DNA that interacted with only one probe
was probably not the target DNA. A clone that hybridized with both probes
was isolated and then sequenced; it contained the 2,5-DKG reductase gene.
The DNA sequences that were upstream of the ATG start signal were
deleted and replaced with transcriptional and translational signals that
function in E. coli, because the regulatory sequences from gram-positive
microorganisms, such as Corynebacterium spp., are not efficiently utilized
by E. coli. This construct expressed 2,5-DKG reductase activity in E. coli and
subsequently was subcloned onto a broad-host-range vector, which was
used to transform E. herbicola, which is able to use E. coli transcriptional and
translational signals.

The transformed Erwinia cells were able to convert D-glucose directly 
to 2-KLG. The endogenous Erwinia enzymes, localized in the inner membrane
of the bacterium, converted glucose to 2,5-DKG, and the cloned 2,5-DKG
reductase, localized in the cytoplasm, catalyzed the conversion of
2,5-DKG to 2-KLG (Fig. 13.7). Thus, by genetic manipulation, the metabolic 
capabilities of two very dissimilar microorganisms were combined into one 
organism, which was able to produce the end product of the engineered 
metabolic pathway. This recombinant organism should be useful as a 
source of 2-KLG for the production of L-ascorbic acid, thereby replacing the 
first three steps of the currently used process (Fig. 13.5). 

The commercial utility of the cloned 2,5-DKG reductase gene product 
might be improved by replacing certain amino acids of the enzyme to create 
mutants with increased catalytic activity and enhanced thermal stability. 
When the 2,5-DKG reductase gene was first isolated, the amino acid resi¬ 
dues that contributed to the active site of this enzyme were not known. 
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FIGURE 13.7 Conversion of D-glucose to 2-KLG by recombinant E. herbicola. The cel¬ 
lular locations of all of the component enzymes are indicated. The enzymes are 
denoted with the letter E and are numbered consecutively. Enzyme E4 is the intro¬ 
duced 2,5-DKG reductase. The major intermediates in the pathway are named. IM 
and OM denote the inner and outer membranes, respectively. 


However, from the primary amino acid sequence, computer modeling predicted
an enzyme structure with an eight-stranded α/β barrel (Fig. 13.8).
This structure consisted of eight twisted parallel β-strands arranged close
together, surrounded by eight α-helices that were joined to the β-strands
through loops of various lengths. This folding pattern had previously been
observed for 17 other enzymes whose crystal structures were known. By 
comparison with the structures of these other proteins, three of the loops 
that might be involved in substrate binding were identified (Fig. 13.8). 
Using oligonucleotide-directed mutagenesis, 12 different mutants, each 
with a single amino acid change in one of these loops, were constructed. Of 
the 12 mutants, 11 produced enzymes with a lower 2,5-DKG reductase spe¬ 
cific activity than that of the native form of the enzyme. The 12th mutant, in 
which amino acid residue 192 was changed from glutamine to arginine, had 
approximately twice the activity of the native enzyme. Kinetic studies 
revealed that this increase in activity resulted from a 1.8-fold increase in the 
maximal rate of the enzyme-catalyzed reaction (Vmax) and a 25% decrease in
the Michaelis constant (Km) of the enzyme-catalyzed reaction.
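
As a rough illustration of how these two kinetic changes combine, the following Python sketch evaluates the Michaelis-Menten rate equation, v = Vmax[S]/(Km + [S]), for the native enzyme and for the mutant. The absolute values of Vmax and Km used here are arbitrary placeholders (they are not given in the text); only the 1.8-fold and 25% changes come from the study described above.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate at substrate concentration s (Michaelis-Menten kinetics)."""
    return vmax * s / (km + s)

# Placeholder native parameters in arbitrary units (assumed, not from the text).
VMAX_NATIVE, KM_NATIVE = 1.0, 1.0

# Changes reported for the glutamine-to-arginine mutant at residue 192.
VMAX_MUTANT = 1.8 * VMAX_NATIVE    # 1.8-fold increase in Vmax
KM_MUTANT = 0.75 * KM_NATIVE       # 25% decrease in Km

def fold_change(s):
    """Ratio of mutant rate to native rate at substrate concentration s."""
    return (michaelis_menten(s, VMAX_MUTANT, KM_MUTANT)
            / michaelis_menten(s, VMAX_NATIVE, KM_NATIVE))

for s in (0.01, 1.0, 100.0):
    print(f"[S] = {s:>6} x native Km: mutant/native rate = {fold_change(s):.2f}")
```

Under these assumptions, the mutant is about 2.4-fold faster at very low substrate concentrations, about 2.06-fold faster when [S] equals the native Km, and 1.8-fold faster at saturation, which is consistent with the approximate doubling of activity that was observed.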



FIGURE 13.8 Predicted structure of 2,5-DKG reductase. The solid arrows indicate
β-stranded regions, the solid bars are α-helical regions, and the circles are amino
acid residues either at the N or C terminus or involved in loops connecting the
β-strands to the α-helices. The three loops that may be involved in substrate binding
are indicated. Amino acid residue 192 is shown in yellow.


The reaction catalyzed by 2,5-DKG reductase utilizes reduced nicotin¬ 
amide adenine dinucleotide phosphate (NADPH) as a cofactor. However, 
the cellular concentration of reduced nicotinamide adenine dinucleotide 
(NADH) is usually about 10-fold greater than that of NADPH, while the 
financial cost of NADPH is about 10 times higher than that of NADH. To 
lower the cost of the bacterial production of ascorbic acid, it would be ben¬ 
eficial to engineer a version of 2,5-DKG reductase that used NADH instead 
of NADPH. The only structural difference between NADH and NADPH is
the phosphate group that NADPH carries at the 2' position of the ribose of its
adenosine moiety. From the three-dimensional structure of 2,5-DKG reductase
complexed with NADPH, it appears that 5 amino acid residues interact 
directly with the 2' phosphate residue of NADPH. Using cassette mutagen¬ 
esis (Fig. 13.9), a total of 40 different mutants of this enzyme were con¬ 
structed; in each constructed mutant, 1 of the 5 amino acid residues that 
computer models suggested interacted with the 2' phosphate residue was 
changed to a different amino acid. Following the expression, purification, 
and kinetic characterization of the 40 mutants in E. coli, it was observed that 
changing three of the five selected amino acids resulted in increases in 2,5- 
DKG reductase activity with NADH as the cofactor. In the best case, when 
the arginine residue at position 238 was changed to histidine, there was a 
sevenfold improvement over the wild type with NADH as a cofactor. 
Moreover, when two amino acid alterations were combined in one protein,
the resulting enzyme showed even more activity with NADH; this double mutant
included a change of the lysine residue at position 232 to glycine, as well as
the change of arginine at position 238 to histidine. Also, when the best NADH-active
mutant was combined with a double mutant which increased the binding 
of the substrate, even further improvements in activity with NADH were 
observed. The activity of the enzyme isolated from the final construct was 
72 times higher than the activity of the wild-type enzyme. It now remains 

FIGURE 13.9 Mutagenesis of the Corynebacterium sp. 2,5-DKG reductase gene. The 
isolated gene, on a plasmid, was digested to remove a 57-bp DNA fragment. Then, 
chemically synthesized 57-bp DNA fragments, each containing an alteration in the
DNA sequence that coded for one of the amino acids to be changed, were spliced into
the gene in place of the original fragment. In addition, each 57-bp fragment (or
mutagenesis cassette) contained a silent mutation that abolished a PstI site but did 
not alter the amino acid sequence. The silent mutation facilitated screening against 
the wild type. 


to be seen whether this engineered enzyme can be used as the basis for the 
economically efficient biological synthesis of ascorbic acid. 
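
The silent mutation described for the mutagenesis cassettes (Fig. 13.9) can be illustrated with a short Python sketch. It searches a coding sequence for single-codon synonymous substitutions that destroy a PstI recognition site (CTGCAG) while leaving the encoded amino acids unchanged. The 12-nucleotide sequence used here is a hypothetical example, not part of the 2,5-DKG reductase gene, and the function names are invented for illustration.

```python
from itertools import product

# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

PSTI_SITE = "CTGCAG"

def translate(seq):
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

def silent_site_removals(seq, site=PSTI_SITE):
    """Return variants of seq, each differing by one synonymous codon,
    that no longer contain the restriction site."""
    variants = []
    for i in range(0, len(seq) - 2, 3):
        original = seq[i:i + 3]
        for codon, aa in CODON_TABLE.items():
            if codon != original and aa == CODON_TABLE[original]:
                candidate = seq[:i] + codon + seq[i + 3:]
                if site not in candidate:
                    variants.append(candidate)
    return variants

toy_cassette = "GCTCTGCAGGAT"   # hypothetical: Ala-Leu-Gln-Asp with a PstI site
for variant in silent_site_removals(toy_cassette):
    print(variant, translate(variant))
```

Because every variant encodes the same peptide but lacks the PstI site, plasmids that carry a cassette can be distinguished from the wild type by a simple restriction digest, which is the screening advantage noted in the figure legend.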

Microbial Synthesis of Indigo 

A number of bacteria, most notably Pseudomonas spp., have the ability to 
use a variety of organic compounds, such as naphthalene, toluene, xylene, 
and phenol, as their sole carbon source. In many instances, the genes 
encoding the enzymes for the degradation of these organic compounds are 
located on large, naturally occurring plasmids (typically 50 to 200 kb in 
size). Research on these bacteria often requires detailed genetic and bio¬ 
chemical studies, so that the genes encoding enzymes catalyzing important 
steps in the pathway can be targeted for modification. Occasionally, despite 
the original purpose of a study, an unexpected but useful discovery is 
made. For example, plasmid NAH7 has two separate and distinct operons
that allow pseudomonads that contain this plasmid to grow on naphthalene
as the sole carbon source. As a first step toward characterizing these
genes, NAH7 plasmid DNA was digested with HindIII and the fragments
were ligated with linear HindIII-digested plasmid pBR322. This clone bank
was introduced into E. coli cells, and transformants were selected on the 
basis of their resistance to ampicillin and sensitivity to tetracycline. All 
transformants were then tested for the production of nonvolatile metabolites
that might result from the hydrolysis of radiolabeled naphthalene.
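
The digestion step can be sketched in Python as follows. The function locates every HindIII recognition site (AAGCTT, cut between the two A residues) in a circular plasmid sequence and reports the sizes of the resulting fragments. The sequence used is a short made-up string rather than NAH7 or pBR322, and the function name is hypothetical.

```python
HINDIII_SITE = "AAGCTT"   # HindIII cuts after the first A: A^AGCTT
CUT_OFFSET = 1

def circular_digest(seq, site=HINDIII_SITE, cut_offset=CUT_OFFSET):
    """Return the fragment lengths produced by cutting a circular
    sequence at every occurrence of a palindromic recognition site."""
    n = len(seq)
    wrapped = seq + seq[:len(site) - 1]   # lets a site span the origin
    cuts = sorted({(i + cut_offset) % n
                   for i in range(n) if wrapped.startswith(site, i)})
    if not cuts:
        return [n]                        # uncut circle
    # Distance from each cut to the next one around the circle; a single
    # cut linearizes the plasmid and gives one full-length fragment.
    return [(cuts[(k + 1) % len(cuts)] - cuts[k]) % n or n
            for k in range(len(cuts))]

toy_plasmid = "AAGCTT" + "G" * 10 + "AAGCTT" + "C" * 20   # made-up, 42 bp
print(circular_digest(toy_plasmid))
```

Since the HindIII site is palindromic, scanning one strand is sufficient. Each fragment carries compatible cohesive ends, which is why the fragments can be ligated directly into HindIII-digested pBR322.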

During the characterization of one of the transformants that had a 
10.5-kb insert and could convert naphthalene to salicylic acid, it was 
observed that when the minimal growth medium contained tryptophan, it 
turned blue. A thorough analysis of the blue color revealed that the trans¬ 
formed E. coli cells were synthesizing the dye indigo. This synthesis was 
achieved in four steps (Fig. 13.10): 

1. Conversion of tryptophan in the growth medium to indole by the 
enzyme tryptophanase, which is produced by the E. coli host cell 

2. Oxidation of indole to cis-indole-2,3-dihydrodiol by naphthalene
dioxygenase, which is encoded by the DNA that was cloned from 
the NAH7 plasmid 

3. Spontaneous elimination of water 

4. Air oxidation to form indigo 

Thus, the combination of enzymes from two different pathways and 
two different organisms resulted in the synthesis of an unexpected compound,
the dye indigo. In addition, when the gene for the enzyme xylene oxidase,
which is encoded in the TOL plasmid, is introduced into E. coli, the cells can
convert tryptophan to indoxyl, which then spontaneously oxidizes to indigo (Fig. 13.10).


FIGURE 13.10 Indigo biosynthesis from tryptophan in genetically engineered E. coli. 
Tryptophanase is an E. coli enzyme. In pathway A, the naphthalene dioxygenase is 
derived from the NAH plasmid; in pathway B, the xylene oxidase is from the TOL 
plasmid. E. coli transformants that synthesize indigo contain either pathway A or B, 
but not both pathways. 

Indigo, a commercially important blue pigment that is used to dye both
cotton and wool, was originally isolated from plants but is currently synthesized
chemically. At present, approximately 13 x 10^6 kg of indigo, worth
more than $250 million, is produced every year. Indigo, the coloring agent 
in blue jeans, is the largest-selling dye in the world. The ability to produce 
indigo from bacteria opens the possibility of developing an efficient and 
economical commercial microbial process for its production. This process 
would avoid the use of hazardous compounds, such as aniline, formalde¬ 
hyde, and cyanide, which are needed in the chemical synthesis of indigo. 

Despite the environmentally friendly nature of engineering bacteria 
to make indigo, at the present time, the chemical synthesis of the dye is 
less expensive, thwarting commercial schemes to produce indigo biologi¬ 
cally. One approach to improving indigo production in E. coli involves 
engineering the host strain to overproduce tryptophan, the raw material 
for the process. However, despite improved efficiency when a tryptophan- 
overproducing E. coli strain is used, the overall process requires addi¬ 
tional improvement before it is economically competitive with chemical 
synthesis. 

Synthesis of Amino Acids 

Amino acids are used extensively in the food industry as flavor enhancers, 
antioxidants, and nutritional supplements; in agriculture as feed additives; 
in medicine in infusion solutions for postoperative treatment; and in the 
chemical industry as starting materials for the manufacture of polymers and 
cosmetics (Table 13.1). It is estimated that more than 2.5 million tons of 
amino acids, worth more than $9 billion, were produced worldwide in 2008. 
L-Glutamic acid, which is used in the manufacture of the flavor enhancer 
monosodium glutamate, makes up around half of the total volume. 


TABLE 13.1 Commercial applications of amino acids 


Amino acid       Application(s)

Alanine          Flavor enhancer
Arginine         Therapy for liver diseases
Aspartic acid    Flavor enhancer; sweetener synthesis
Asparagine       Diuretic
Cysteine         Bread production; therapy for bronchitis; antioxidant
Glutamic acid    Flavor enhancer
Glutamine        Therapy for ulcers
Glycine          Sweetener synthesis
Histidine        Therapy for ulcers; antioxidant
Isoleucine       Intravenous solutions
Leucine          Intravenous solutions
Lysine           Feed additive; food additive
Methionine       Feed additive
Phenylalanine    Infusions; sweetener synthesis
Proline          Intravenous solutions
Serine           Cosmetics
Threonine        Feed additive
Tryptophan       Intravenous solutions; antioxidant
Tyrosine         Intravenous solutions; precursor for L-DOPA
Valine           Intravenous solutions

For the most part, amino acids are commercially produced either by
extraction from protein hydrolysates or as fermentation products of either 
Corynebacterium or Brevibacterium spp., which are both nonsporulating 
gram-positive soil bacteria that secrete large amounts of amino acids into 
the growth medium. Traditionally, the productivity of these organisms has 
been improved by mutagenesis and subsequent screening for strains that 
overproduce certain amino acids. However, this way of developing new
strains is slow and inefficient. By using detailed biochemical information 
about the enzymes that are involved in the biosynthesis of various com¬ 
mercially important amino acids, it is more expeditious to isolate and 
manipulate the specific genes encoding the key components of a particular 
pathway. However, this type of genetic engineering is not a simple matter. 
For example, the pathway(s) leading to the biosynthesis of certain amino 
acids contains a number of different enzymes, each of which may be either 
activated or inhibited by a number of metabolites present in the cell. This 
makes it difficult to know which enzyme(s) to manipulate in order to 
enhance the yield of the end product. 

Because most broad-host-range plasmid vectors replicate only in gram¬ 
negative organisms, it is necessary to construct expression vectors that are 
specifically suited for Corynebacterium and Brevibacterium spp. Such cloning 
vehicles might take the form of E. coli-Corynebacterium shuttle vectors. The 
E. coli portion of the plasmid could encode resistance to the antibiotic tetracycline,
chloramphenicol, or kanamycin. Because both E. coli and
Corynebacterium spp. are susceptible to these antibiotics, they could be used 
as selectable markers in both organisms. 

An efficient transformation protocol for Corynebacterium glutamicum, 
the species of Corynebacterium often used in these experiments, is still 
needed. Also, many C. glutamicum genes are not efficiently expressed in E. 
coli. Therefore, for selection schemes that depend on gene expression (e.g., 
complementation), the entire clone bank should be transformed into C. 
glutamicum. Unfortunately, the transformation frequency is very low when 
DNA is introduced into C. glutamicum by either direct transformation or 
electroporation. However, effective transformation of C. glutamicum is 
achieved when foreign DNA is introduced by conjugation or after the for¬ 
mation of protoplasts, i.e., after removal of the cell wall with lysozyme. The 
transformation of protoplasts is made possible by adding polyethylene 
glycol to facilitate the uptake of exogenous plasmid DNA. 

Some progress has been made in increasing the amino acid output of 
C. glutamicum. For example, the synthesis of the essential amino acid tryp¬ 
tophan was enhanced by introducing into wild-type C. glutamicum cells a 
second copy of the gene encoding anthranilate synthetase, which is the 
rate-limiting enzyme in the normal tryptophan biosynthetic pathway (Fig. 
13.11). The following protocol describes one way to isolate the anthranilate 
synthetase gene. 

1. A library of Brevibacterium flavum chromosomal DNA was cloned 
into a C. glutamicum-E. coli shuttle vector and introduced into a 
mutant strain of C. glutamicum that produced no active anthranilic 
acid synthetase. 

2. The mutant strain was unable to grow on minimal medium unless 
anthranilic acid was added; therefore, transformants were selected 
by their ability to grow in the absence of anthranilic acid. 

3. The vector carrying the anthranilic acid synthetase gene was then 
transferred to a wild-type strain of C. glutamicum. 

FIGURE 13.11 Simplified pathway and regulation of tryptophan biosynthesis in C.
glutamicum. The abbreviations DS, ANS, and PRT represent the enzymes
3-deoxy-D-arabino-heptulosonate 7-phosphate synthase, anthranilate synthase, and
anthranilate phosphoribosyltransferase, respectively. The solid lines represent the synthetic
pathways. The dashed lines denote feedback inhibition. Indole is produced in a side
reaction and is converted into tryptophan by the action of tryptophan synthase β.


The amounts of tryptophan produced in the mutant and wild-type C. 
glutamicum strains—one without and one with the vector carrying the 
cloned anthranilic acid synthetase gene—were measured (Table 13.2). The 
cloned gene did indeed restore most of the capacity of the mutant to syn¬ 
thesize tryptophan. Moreover, the effect of adding this gene to wild-type C. 
glutamicum was much more dramatic, with the synthesis of tryptophan 
being increased by approximately 130%. This level of overproduction 
reflects more efficient utilization of available precursor material. Thus, by 
cloning an additional gene of an amino acid biosynthesis pathway into an 
organism, it was possible to generate much more of the end product. An 
even higher level of tryptophan production was achieved when modified 
genes for the three key enzymes, 3-deoxy-D-arabino-heptulosonate 7-phos¬ 
phate synthase, anthranilate synthase, and anthranilate phosphoribosyl¬ 
transferase, were introduced into C. glutamicum cells (Fig. 13.11). The genes 
encoding these enzymes were mutagenized to render them insensitive to 
inhibition by the end product (feedback inhibition). 

TABLE 13.2 Production of tryptophan under standard growth
conditions by certain strains of C. glutamicum

Strain                   Tryptophan concentration (mg/mL)

Mutant                   0.00
Mutant with vector       0.34
Wild type                0.48
Wild type with vector    1.12

Adapted from Ozaki et al., U.S. patent 4,874,698, 1989.
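
The comparisons drawn from Table 13.2 can be checked with a few lines of Python. The concentrations are those listed in the table; the variable and function names are chosen here only for illustration.

```python
# Tryptophan concentrations (mg/mL) from Table 13.2.
tryptophan = {
    "mutant": 0.00,
    "mutant with vector": 0.34,
    "wild type": 0.48,
    "wild type with vector": 1.12,
}

def percent_increase(new, old):
    """Percent increase of new relative to old."""
    return 100.0 * (new - old) / old

# Effect of the extra gene copy in the wild-type background.
increase = percent_increase(tryptophan["wild type with vector"],
                            tryptophan["wild type"])

# Fraction of wild-type production restored in the mutant background.
restored = 100.0 * tryptophan["mutant with vector"] / tryptophan["wild type"]

print(f"Increase in wild type with vector: {increase:.0f}%")
print(f"Wild-type level restored in mutant: {restored:.0f}%")
```

The calculation gives an increase of about 133% for the wild type carrying the vector, in line with the figure of approximately 130% quoted in the text, and shows that the cloned gene restored about 71% of the wild-type level in the mutant.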


An alternative to producing amino acids in Corynebacterium and 
Brevibacterium spp. is to produce them in E. coli, where both metabolic pathways
and procedures for genetic manipulation are much better characterized.
Its relative ease of manipulation makes E. coli an attractive host
organism for this sort of metabolic engineering. 

L-Cysteine, which is one of the most important amino acids in the phar¬ 
maceutical, food, and cosmetics industries, has traditionally been obtained 
by extracting it from acid hydrolysates of human hair and animal feathers. 
While a number of microorganisms are able to synthesize L-cysteine, high 
levels cannot be synthesized from glucose because L-cysteine feedback 
inhibits the enzyme serine (Ser) acetyltransferase, which catalyzes one of 
the steps in the biosynthesis of L-cysteine (Fig. 13.12). To try to remedy this 
situation, the methionine residue at position 256 in the E. coli serine acetyl¬ 
transferase amino acid sequence was systematically changed to each of the 
other 19 amino acids. When E. coli was transformed with plasmids carrying 
cysE genes encoding altered serine acetyltransferase, several transformants 
with altered forms of serine acetyltransferase produced higher levels of 
L-cysteine than did the wild-type enzyme. Next, plasmids encoding the 
most effective serine acetyltransferase derivatives were used to transform 
a strain of E. coli that did not degrade L-cysteine. To improve on this modest 
success, complementary DNAs (cDNAs) encoding feedback inhibition-insensitive
serine acetyltransferases from the plant Arabidopsis thaliana
were expressed in a serine acetyltransferase-deficient and non-L-cysteine-utilizing
E. coli strain. The transformants combined several different strategies
and produced a much higher level of L-cysteine than had previously
been possible by merely manipulating the E. coli serine acetyltransferase
gene. Whether the expression of this protein can be further improved and 
how this affects the synthesis of L-cysteine remain to be determined. 
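
Why relieving feedback inhibition raises the level of product can be seen in a deliberately simple model, sketched below in Python. It is an assumption made for illustration, not the kinetics of the real enzyme: the product P is made at a rate Vmax/(1 + P/Ki), where Ki is a hypothetical inhibition constant, and is removed at a rate k_out x P. Setting the two rates equal gives a quadratic in P whose positive root is the steady-state level.

```python
import math

def steady_state_product(vmax, ki, k_out):
    """Steady-state product level P for the toy model
    vmax / (1 + P / ki) = k_out * P   (all parameters are hypothetical)."""
    return 0.5 * ki * (-1.0 + math.sqrt(1.0 + 4.0 * vmax / (k_out * ki)))

VMAX, K_OUT = 10.0, 1.0            # arbitrary units, chosen for illustration
for ki in (0.1, 1.0, 10.0, 1000.0):
    p = steady_state_product(VMAX, ki, K_OUT)
    print(f"Ki = {ki:>7}: steady-state product = {p:.2f}")
```

In this sketch a feedback-sensitive enzyme (small Ki) holds the product far below the uninhibited ceiling of Vmax/k_out, whereas an insensitive enzyme (large Ki) approaches it. This mirrors the rationale for altering serine acetyltransferase, although real pathways involve many more interacting steps.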

To rationally engineer a bacterium to modify its metabolism so that it 
overproduces a particular amino acid, it is essential to understand how 
many of the metabolic pathways of the bacterium are interrelated and 
regulated. Subsequent systematic reengineering of the metabolism of the 
organism can then be performed to achieve a much higher yield than 
would ever be possible by merely modifying one or two genes in the imme¬ 
diate biosynthetic pathway of a particular amino acid. At each stage in the 
development of the reengineered bacterium, it is possible to monitor the 
levels of a wide range of transcripts (using microarrays) and metabolites 
(metabolomics). Moreover, the mRNA and metabolite expression data 
allow the construction of a detailed computer model that can be used to 
predict the effects of other possible genetic manipulations on the produc¬ 
tion of the target amino acid. While this approach is relatively new, such 


FIGURE 13.12 Biosynthesis of L-cysteine 
from L-serine and acetyl-CoA. The 
dashed arrow indicates feedback inhi¬ 
bition. 

rational genetic engineering may be used to engineer bacterial strains for a
variety of purposes, in addition to the production of specific amino acids. 

To rationally engineer E. coli to overproduce L-valine, it was necessary 
to introduce a large number of mutations into the E. coli genome (Fig. 
13.13). The flux from glucose to L-valine was first improved by abolishing 
feedback inhibition by L-valine of the enzyme that converts pyruvate to 
2-acetolactate. Then, the subunits of this enzyme were overexpressed by 
replacing the endogenous promoter with a strong constitutive promoter. 
Next, the carbon flux toward L-valine was increased. The gene for the 
enzyme that converts L-threonine to 2-ketobutyrate was knocked out, as 
were the genes encoding enzymes that convert 2-ketoisovalerate to either 
pantothenate or L-leucine. In addition, some of the genes that encode 
enzymes that convert pyruvate to 2-acetolactate were amplified. 
Subsequently, the expression levels of the genes encoding enzymes that 
convert 2-acetolactate to L-valine via 2,3-dihydroxyisovalerate and 2-keto¬ 
isovalerate were all increased by modifying the regulatory regions of these 
genes. In addition, the lrp gene, encoding the leucine-responsive regulatory
protein, was amplified, since some of the genes in the L-valine pathway were
under the positive transcriptional control of this protein. Finally, the 
expression of the two genes that encode proteins responsible for the export 
of L-valine from the bacterial cell was amplified, since it was thought that 
a low level of these proteins was limiting to L-valine production. Following 
the extensive metabolic engineering of E. coli, the final strain was able to 


FIGURE 13.13 Simplified overview of the production of L-valine in rationally engi¬ 
neered E. coli. The red X's indicate pathways that were knocked out, the green 
arrows indicate pathways that were upregulated, and the dashed blue line indicates 
feedback inhibition. Some arrows represent a single enzymatic step, while other 
arrows represent several enzymatic steps. The interaction of glucose with other 
pathways, the export of L-valine from the cell, and several more complex regulatory 
steps are not shown. 

produce 0.378 g of L-valine per g of glucose, which is higher than the yield of industrial
strains of C. glutamicum that have been developed by repeated rounds of 
random mutagenesis and selection. Moreover, researchers believe that the 
rationally engineered strain can be further modified with resulting 
increases in L-valine productivity. Finally, it is important to note that the 
approach used here should in principle be useful for the overproduction of 
a wide range of metabolites from bacteria. 
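
To put the reported yield in perspective, the following Python sketch compares it with a simple carbon-based ceiling. It assumes that one molecule of glucose gives two molecules of pyruvate and that two molecules of pyruvate are needed for each molecule of L-valine, so that at most 1 mol of L-valine can be formed per mol of glucose. This ceiling ignores the cost of cofactors, the amino group, and cell growth, so it is an upper bound chosen for illustration rather than a value taken from the study.

```python
# Standard atomic weights (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def molar_mass(composition):
    """Molar mass (g/mol) from a mapping of element symbol to atom count."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())

GLUCOSE = {"C": 6, "H": 12, "O": 6}
VALINE = {"C": 5, "H": 11, "N": 1, "O": 2}

# Assumed ceiling: 1 mol of L-valine per mol of glucose (see the text above).
max_yield = molar_mass(VALINE) / molar_mass(GLUCOSE)   # g of valine per g of glucose
reported_yield = 0.378                                 # g per g, from the text

print(f"Carbon-based ceiling: {max_yield:.3f} g/g")
print(f"Reported yield is {100 * reported_yield / max_yield:.0f}% of this ceiling")
```

Under this assumption the ceiling is about 0.65 g of L-valine per g of glucose, so the engineered strain reaches roughly 58% of it, which leaves room for the further improvements that the researchers anticipate.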

Microbial Synthesis of Lycopene 

Lycopene (C40H56) is a bright red carotenoid pigment (Fig. 13.14) that is
commonly found in tomatoes and other fruits, including watermelons, 
pink grapefruit, pink guavas, papayas, and rose hips. It is a powerful anti¬ 
oxidant that has been suggested to decrease low-density lipoprotein oxida¬ 
tion in humans and thereby to lower the risk of atherosclerosis and 
coronary heart disease. In addition, lycopene, and several carotenoids 
derived from lycopene, have been proposed as treatments for some types 
of cancer. It would be useful if lycopene (and carotenoids produced from 
lycopene) could be produced in microorganisms so that the large-scale- 
processing problems that exist when lycopene is isolated from tomatoes 
might be avoided. For the production of lycopene in E. coli, the 2-C-methyl- 
D-erythritol 4-phosphate pathway provides the precursors isopentenyl 
diphosphate and dimethylallyl diphosphate (Fig. 13.15). However, by
introducing Saccharomyces cerevisiae genes encoding the mevalonate 
pathway under the control of E. coli transcriptional promoters, the levels of 
these precursor molecules are increased dramatically. By subsequently 
introducing Pantoea agglomerans (a gram-negative bacterium) genes 
encoding the biosynthesis of lycopene from the above-mentioned precursor 
metabolites, a relatively high level of lycopene can be produced (Fig. 13.15). 
With this engineered E. coli strain, which contains two additional biosyn¬ 
thetic pathways, it was possible to obtain approximately 60 mg of lycopene 
per liter of bacterial culture. While additional optimization of this system 
is still necessary before it can be the basis of a commercial system for lyco¬ 
pene production, this work is an important step in the development of such 
a system. 
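
The reported titer can be converted to molar terms with a short Python sketch. The formula and the titer of 60 mg per liter come from the text; the small formula parser and the variable names are written here for illustration only and handle just simple formulas of this kind.

```python
import re

ATOMIC_MASS = {"C": 12.011, "H": 1.008}   # standard atomic weights, g/mol

def formula_mass(formula):
    """Molar mass (g/mol) of a simple formula such as 'C40H56'."""
    return sum(ATOMIC_MASS[element] * int(count or 1)
               for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula))

LYCOPENE = "C40H56"
TITER_MG_PER_L = 60.0

lycopene_mass = formula_mass(LYCOPENE)
# mg/L divided by g/mol gives mmol/L; multiply by 1000 for micromol/L.
titer_umol_per_l = TITER_MG_PER_L / lycopene_mass * 1000.0
# Each lycopene molecule is assembled from five-carbon diphosphate precursors.
c5_units = 40 // 5

print(f"Lycopene molar mass: {lycopene_mass:.1f} g/mol")
print(f"Titer: {titer_umol_per_l:.0f} micromol per liter")
print(f"Five-carbon precursor units per molecule: {c5_units}")
```

Because eight five-carbon units are consumed for every molecule of lycopene, the supply of isopentenyl diphosphate and dimethylallyl diphosphate is a natural target for engineering, which is the reason for introducing the mevalonate pathway.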

Increasing Succinic Acid Production 

Succinic acid (succinate) is a dicarboxylic acid that is a component of the 
citric acid cycle; it is formed from fumarate and reacts to form succinyl- 
coenzyme A (CoA) (Fig. 13.16). At room temperature, pure succinic acid is 
a colorless and odorless solid that is moderately soluble in water. Succinic 
acid is used as a flavoring for foods and beverages and in the production 
of dyes, perfumes, lacquers, resins, and a variety of medicines. It is currently
synthesized by the catalytic hydrogenation of maleic acid or its anhydride;
however, there is increasing interest in the production of succinic
acid from renewable sources by microbial fermentation. 


FIGURE 13.14 The chemical structure of lycopene. 



FIGURE 13.15 Overview of the production of lycopene in an engineered strain of E. 
coli. The yellow metabolites are normally produced in E. coli. The portion of the 
pathway highlighted in blue represents the enzymes encoded by S. cerevisiae genes, 
as well as some of the metabolites that they produce. The pathway highlighted in 
green represents enzymes encoded by P. agglomerans and the metabolites that they 
produce. Introduced enzymes are depicted, adjacent to the arrows, by asterisks. 


Although a very large number of bacteria synthesize succinic acid, 
only a few of these organisms, including Anaerobiospirillum succiniproducens,
Actinobacillus succinogenes, and Mannheimia succiniproducens, produce
the metabolite at high levels. Unfortunately, at the same time that
these anaerobic bacteria produce succinic acid, they also produce, and 
excrete, significant amounts of acetic, formic, and lactic acids. This not 
only reduces the yield of succinic acid, it also makes the purification pro¬ 
cess more difficult and costly. To increase the amount of succinic acid 
produced by M. succiniproducens, genes that were known to be involved in 
the synthesis of acetic, formic, and lactic acids from pyruvic acid were 
sequentially disrupted (Fig. 13.16), and each mutant was tested for the 
ability to synthesize succinic (and acetic, formic, and lactic) acid. Using 
this strategy, it was possible to engineer a strain, under the anaerobic con¬ 
ditions that are normal for the bacterium, to produce 13.5 g of succinic acid 
per liter of culture compared to 10.5 g per liter for the wild-type bacterium. 
At the same time, the production of formic and lactic acids was completely 

FIGURE 13.16 Overview of the metabolic pathways that lead to the formation of succinic,
formic, lactic, and acetic acids under anaerobic conditions in M. succiniproducens.
The metabolites indicated in green highlight the production of succinic acid
from glucose. The metabolites shown in yellow depict the synthesis of unwanted
side products. The four genes in red (i.e., ldhA, pflB, pta, and ackA) were mutated to
decrease the flux through this portion of the pathway. The gene in green (maeB) may
be overexpressed in an effort to decrease pyruvic acid and increase succinic acid.
ldh, lactate dehydrogenase; pfl, pyruvate formate lyase; pta, phosphotransacetylase;
ack, acetate kinase; mae, malic enzyme.


abolished and the amount of acetic acid was significantly reduced (Fig. 
13.17). In addition, when the cells that contained four separate mutations 
were grown in a fed-batch mode (see chapter 17), the yield of succinic acid 
increased to 52.4 g per liter while the amount of pyruvate was only 0.8 g 
per liter. Although this modified strain secreted pyruvic acid into the 
medium, it is technically simpler to remove pyruvic acid than acetic, 
formic, or lactic acid. Nevertheless, it is hoped that additional metabolic 
engineering (e.g., by overproducing MaeB) will both increase the amount 
of succinic acid and decrease the level of pyruvic acid that this strain pro¬ 
duces so that the modified bacterium can be used as a biological "factory" 
for succinic acid synthesis. 


Antibiotics 

Since the discovery of penicillin in the late 1920s, more than 12,000 antibi¬ 
otics with different specificities and a variety of modes of action have been 
isolated from various microorganisms. The universal use of antibiotics to 
treat bacterial diseases has resulted in an enormous improvement in human 


FIGURE 13.17 Formation of succinic, formic, lactic, acetic, and pyruvic acids by wild-type
and mutant strains of M. succiniproducens. Bars: A, wild type; B, ldhA mutant;
C, ldhA and pflB mutant; D, ldhA, pflB, pta, and ackA mutant.


health and has undoubtedly saved millions of lives. The majority of the 
most important antibiotics have been isolated from the gram-positive soil 
bacterium Streptomyces, although fungi and other gram-positive and gram¬ 
negative bacteria are also sources of antibiotics (Table 13.3). Worldwide, 
over 100,000 tons of antibiotics is produced per year, with annual gross 
sales of about $35 billion, including antibiotics used in animal feed and as 
animal growth promoters. The antibiotic market is driven by the sales of 
four leading drug classes: the cephalosporins (27%), macrolides (20%), quinolones
(17%), and penicillins (17%). Together, these four drug classes
account for more than 80% of global antibacterial sales. 

An estimated 200 to 300 new antibiotics are discovered each year, pri¬ 
marily through labor-intensive research programs in which many thou¬ 
sands of different microorganisms are screened to find those that produce 
unique antibiotics. However, with the high costs of development and clin¬ 
ical testing, only the compounds that show significant therapeutic and 
economic promise are marketed. Therefore, only about 1 to 2% of newly 
discovered antibiotics are added annually to the disease-fighting arsenal. 
In fact, the pharmaceutical industry has been reluctant to invest in research 
and development in this area, and many companies have either abandoned 
or scaled down their efforts since 1999. In addition, to date, nearly all of the 
genetic improvements to industrially important antibiotic-producing 
strains have been achieved by the use of classical mutagenesis and selection.
While the yields of antibiotics from many strains have been


TABLE 13.3 Some of the most common microbially synthesized antibiotics

Amikacin sulfate  Cefotaxime            Chlortetracycline   Kanamycin sulfate        Streptomycin sulfate
Amoxicillin       Cefoxitin             Clarithromycin      Lincomycin HCl           Teicoplanin
Ampicillin        Cefpodoxime proxetil  Clindamycin         Methicillin              Tetracycline HCl
Azithromycin      Ceftazidime           Erythromycin A      Oxytetracycline          Vancomycin HCl
Benzylpenicillin  Ceftriaxone           Flomoxef            Phenoxymethylpenicillin
Cefaclor          Cefuroxime            Gentamicin sulfate  Rifampin
Cefixime          Cephalexin            Imipenem            Spiramycin


significantly improved (the original penicillin-producing fungus isolated by
Alexander Fleming yielded 2 units per milliliter of culture, while the strains
used today synthesize approximately 70,000 units per milliliter of culture),
this yield improvement took many years and required the use of considerable
manpower and financial resources. Recombinant DNA technology can
have a positive impact on this endeavor in two ways. First, the technology 
can be used to develop new, structurally unique antibiotics with increased 
activities against selected targets and decreased side effects. Second, genetic 
manipulation can be used to relatively rapidly and inexpensively enhance 
yields and hence lower the cost of production of existing antibiotics. 

For the genetic manipulation of Streptomyces, it is essential that it can 
be transformed and that the transformed cells can be readily selected. 
However, unlike E. coli, Streptomyces strains do not exist as individual cells
but as extended aggregates called mycelial filaments. The cell wall must be 
removed to release individual cells (protoplasts) before DNA transforma¬ 
tion (Fig. 13.18). Without this step, it would not be possible to distinguish 
transformed from nontransformed cells, because visible colonies on a solid 
medium would each have started from a cell aggregate rather than from an 
individual cell. Thus, colonies that grew in the presence of a selective anti¬ 
biotic would contain a mixture of transformed and nontransformed cells. 
However, as a consequence of protoplast formation prior to transforma¬ 
tion, all colonies that grow in the presence of a selective antibiotic contain 
only transformed cells. The uptake of plasmid DNA into Streptomyces pro¬ 
toplasts is enhanced by polyethylene glycol. Following transformation, the 
protoplasts are first plated onto a solid medium to enable the cell walls to 
regenerate and are then overlaid with a selective medium that often con¬ 
tains either neomycin or thiostrepton, both of which act as selection agents 
for transformed cells. 

Cloning Antibiotic Biosynthesis Genes 

The biosynthesis of an antibiotic may include 10 to 30 separate
enzyme-catalyzed steps, so cloning all the genes for the synthesis of a
particular antibiotic is not an easy task. One strategy for isolating the
complete set of antibiotic biosynthesis genes consists of transforming one
or more mutant strains that are unable to synthesize the antibiotic with
DNA from a clone bank constructed from wild-type chromosomal DNA.
Following the introduction of the clone bank DNA into mutant cells,
transformants are screened for the ability to produce the antibiotic. Then,
the plasmid DNA from the clone that supplies a functional gene and gene
product, i.e., complements a mutant strain, is used as a DNA hybridization
probe to screen another clone bank of wild-type chromosomal DNA (i.e., one
in which the average-size fragment is around 10 kb) to isolate clones with
regions that overlap the probe sequence. In this way, DNA segments that
are adjacent to and usually bigger than the initial complementing DNA can
be identified and cloned. A complete gene cluster can be reconstructed
from the overlapping clones. If the antibiotic biosynthesis genes are
clustered at a single site on the chromosomal DNA, the genes that are
adjacent to the complementing gene are also likely to be involved in the
biosynthesis of the target antibiotic. However, if the antibiotic
biosynthesis genes are scattered in small clusters at different chromosomal
locations, it is necessary to have at least one mutant per gene cluster to
obtain a DNA clone that can be used to identify the rest of the genes in
the cluster.
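
The reconstruction of a gene cluster from overlapping clones can be
illustrated with a short sketch. It is only a toy model: it assumes that
each clone is described by hypothetical start and end coordinates (in
kilobases) on the chromosome, whereas in the laboratory overlap is
detected by hybridization and the coordinates are not known in advance.
The clone names, the coordinates, and the functions overlaps, walk, and
span are all invented for illustration.

```python
from typing import List, Tuple

# (name, start_kb, end_kb); all coordinates are hypothetical
Clone = Tuple[str, float, float]

def overlaps(a: Clone, b: Clone) -> bool:
    """Two clones overlap if their chromosomal intervals share a position."""
    return a[1] <= b[2] and b[1] <= a[2]

def walk(probe: Clone, bank: List[Clone]) -> List[Clone]:
    """Collect every clone linked to the probe through a chain of overlaps."""
    contig = [probe]
    remaining = [c for c in bank if c != probe]
    extended = True
    while extended:  # repeat until a full pass adds no new clone
        extended = False
        for clone in list(remaining):
            if any(overlaps(clone, member) for member in contig):
                contig.append(clone)
                remaining.remove(clone)
                extended = True
    return sorted(contig, key=lambda c: c[1])

def span(contig: List[Clone]) -> Tuple[float, float]:
    """Chromosomal region covered by the assembled clones."""
    return min(c[1] for c in contig), max(c[2] for c in contig)

probe = ("complementing", 40.0, 48.0)
bank = [("c1", 33.0, 43.0), ("c2", 46.0, 57.0),
        ("c3", 55.0, 66.0), ("c4", 90.0, 100.0)]
contig = walk(probe, bank)
print([c[0] for c in contig], span(contig))
```

In this example, clone c4 lies at a distant site and is never reached from
the probe, which mirrors the point made above: genes in a separate cluster
can be found only with a second complementing clone as a starting point.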
FIGURE 13.18 Schematic representation of DNA transformation and selection of
transformants of Streptomyces strains. The pink circles represent transformed cells, 
and the green circles represent nontransformed cells. PEG, polyethylene glycol. 


The complementation approach has been used to isolate some of the 
genes for the biosynthesis of the antibiotic undecylprodigiosin from 
Streptomyces coelicolor A3 (Fig. 13.19). In this case, the complementation 
assay is simple and entails scoring the color of the colonies. Colonies of 
wild-type organisms are red because of the presence of the antibiotic, and 
mutant colonies are cream colored. Complementation produces a red 
colony (Fig. 13.20). 
FIGURE 13.19 Chemical structure of the red antibiotic undecylprodigiosin.


FIGURE 13.20 Method for cloning genes involved in the biosynthesis of the antibiotic
undecylprodigiosin. Chromosomal DNA from wild-type antibiotic-producing cells
is spliced into a Streptomyces cloning vector. The clone bank is used to transform a
noncolored (i.e., non-antibiotic-producing) mutant of the wild type. Red
transformants (in which the mutant has been complemented) are selected, and the
plasmid DNA insert is characterized.


In addition to cloning antibiotic biosynthesis genes by complementation,
more direct strategies can be employed. One or more of the key enzymes in
a biosynthetic pathway can be identified through either genetic or
biochemical studies and then purified. The N-terminal amino acid sequence
of the enzyme can then be determined, and with this information,
oligodeoxyribonucleotide probes for the gene can be prepared. This
approach has been used to isolate the gene for isopenicillin N synthetase
from Penicillium chrysogenum. This enzyme catalyzes the oxidative
condensation of the compound δ-(L-α-aminoadipyl)-L-cysteinyl-D-valine to
isopenicillin N, a key intermediate in the biosynthesis of penicillins,
cephalosporins, and cephamycins (Fig. 13.21).
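
Because most amino acids are encoded by more than one codon, a probe
deduced from an N-terminal amino acid sequence is in practice a mixture of
oligonucleotides, and it helps to choose the stretch of the sequence with
the fewest possible coding sequences. The sketch below illustrates that
choice using the standard genetic code. The peptide MWDHKFLSRA is a
made-up example, not the sequence of isopenicillin N synthetase, and the
function names degeneracy and best_window are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code, listed with the third base varying fastest (T, C, A, G)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS))

# Number of synonymous codons for each amino acid
CODON_COUNT = {}
for codon, aa in CODON_TABLE.items():
    CODON_COUNT[aa] = CODON_COUNT.get(aa, 0) + 1

def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide)

def best_window(peptide: str, length: int):
    """Return (start, subpeptide, degeneracy) for the least degenerate window."""
    windows = [
        (degeneracy(peptide[i:i + length]), i, peptide[i:i + length])
        for i in range(len(peptide) - length + 1)
    ]
    deg, start, sub = min(windows)  # ties resolve to the earliest window
    return start, sub, deg

# Hypothetical N-terminal sequence; 6 residues correspond to an 18-nucleotide probe
start, sub, deg = best_window("MWDHKFLSRA", 6)
print(start, sub, deg)
```

Windows rich in methionine and tryptophan (one codon each) give the
smallest probe mixtures, whereas leucine, serine, and arginine (six codons
each) rapidly increase the number of oligonucleotides that must be
synthesized.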

Despite the difficulty, there are a number of examples of the cloning and
transfer of large fragments of DNA encoding entire antibiotic biosynthetic
pathways. In these cases, it is usually necessary to use a vector that can
accept and maintain pieces of DNA as large as 100 kb. For this purpose,
researchers have employed bacterial artificial chromosomes that have been
engineered to replicate autonomously in E. coli and, when they are
introduced into Streptomyces, to integrate into the chromosome.


FIGURE 13.21 Biosynthetic pathway for penicillins and cephalosporins in
P. chrysogenum. Isopenicillin N synthetase catalyzes the synthesis of isopenicillin N from
δ-(L-α-aminoadipyl)-L-cysteinyl-D-valine. Isopenicillin N is a precursor in the
synthesis of penicillin G, penicillin N, and cephalosporin C.

Modulating Gene Expression in Streptomycetes

To improve the productivity of antibiotics produced by Streptomyces spp., it
is desirable to utilize a regulatory expression system that can suppress the
expression of the target gene(s) until the culture reaches a high cell density.
It would also be beneficial if the system could be induced simply and
inexpensively and if it could function in a range of different Streptomyces spp.
One regulatable, high-expression Streptomyces system that was recently
developed utilizes the nitrilase operon from the actinomycete Rhodococcus
rhodochrous. In this bacterium, the expression of the enzyme nitrilase
(encoded by the nitA gene under the control of the nitA promoter) is
positively regulated by the protein NitR (encoded by nitR, which is present in
the same operon as nitA). NitR forms a complex with the inducer,
ε-caprolactam, before the complex binds to the nitA promoter, activating the
synthesis of both proteins, NitA and NitR. To use this system as part of a
high-expression vector, several modifications were made (Fig. 13.22): a
transcription terminator was placed upstream of the nitA promoter to
prevent transcriptional read-through from other genes; a multiple cloning
site for inserting target DNA was placed downstream of the nitA promoter;
downstream of the multiple cloning site (and the target gene), the
synthetic operon ends with another transcription terminator; and finally,
the vector contains the nitR gene under the control of the nitA promoter in
a separate operon. When this expression vector is used, upon the addition
of the inducer ε-caprolactam, the inducer-NitR protein complex activates
the transcription of both the target gene(s) and nitR. The increased level
of NitR results in a very high level of production of the target
protein(s). While this system is still at a relatively early stage of
development, with some target proteins, it has been possible to achieve
expression levels as high as approximately 40% of all soluble protein.
Moreover, the system functioned well when it was introduced into
S. coelicolor, Streptomyces avermitilis, and Streptomyces griseus,
bacterial strains that have all previously been used for antibiotic
production. However, it still remains to be demonstrated that this
expression system can be used as an effective method for increasing the
yield of a commercially important antibiotic that is produced in a
Streptomyces strain.

Synthesis of Novel Antibiotics 

New antibiotics with unique properties and specificities may be produced 
by genetic manipulation of the genes involved in the biosynthesis of existing 
antibiotics. In one of the first experiments in which a novel antibiotic was 
produced, researchers began by examining the consequences of placing two 
slightly different antibiotic production pathways into one organism. 

FIGURE 13.22 Overview of a regulatable expression system for use with Streptomyces
spp. TT, transcription terminator; MCS, multiple cloning site; pnitA, the nitA gene
promoter. The nitA gene encodes nitrilase; nitR encodes a positive transcriptional
regulatory protein; the inducer is ε-caprolactam.


A Streptomyces plasmid (pIJ2303) carrying a 32.5-kb fragment of
S. coelicolor chromosomal DNA contains all of the genes encoding the
enzymes responsible for the biosynthesis of the antibiotic actinorhodine,
starting from acetate. This antibiotic is a member of the family of
antibiotics called isochromanequinones (Fig. 13.23). The intact plasmid and
various subclones carrying portions of the 32.5-kb DNA fragment (e.g.,
pIJ2315) were introduced into either Streptomyces sp. strain AM-7161, which
produces the related antibiotic medermycin, or Streptomyces violaceoruber
B1140 or Tü22, both of which produce the related antibiotics granaticin
and dihydrogranaticin.

Each of the antibiotics actinorhodine, medermycin, granaticin, and
dihydrogranaticin (Fig. 13.23) functions as an acid-base indicator,
conferring on a growing culture a characteristic color that depends on the
pH of the medium (Table 13.4). The pH (and color), in turn, depends on the
compound(s) being synthesized. Mutants of the S. coelicolor parental strain
that are unable to produce actinorhodine are colorless. The appearance of
new colors (in some cases) following the transformation of Streptomyces sp.
strain AM-7161, S. violaceoruber B1140, or S. violaceoruber Tü22 with a
plasmid carrying either all or some of the genes encoding the enzymes that
synthesize actinorhodine suggests that a novel antibiotic has been
produced (Fig. 13.23 and Table 13.4). Streptomyces sp. strain AM-7161 and
S. violaceoruber B1140 transformants containing pIJ2303 produce the
antibiotics encoded by both the chromosomal and plasmid DNAs. However,
when S. violaceoruber Tü22 is transformed with pIJ2303, a new antibiotic,
dihydrogranatirhodine, is synthesized, along with actinorhodine. When
Streptomyces sp. strain AM-7161 is transformed with pIJ2315, a second new
antibiotic, mederrhodine A, is produced.

FIGURE 13.23 Structures of various isochromanequinone antibiotics produced by
Streptomyces spp. Wild-type S. coelicolor and plasmid pIJ2303 encode actinorhodine,
a Streptomyces sp. produces medermycin, and S. violaceoruber produces both
granaticin and dihydrogranaticin. The hybrid antibiotics produced are mederrhodine A
and dihydrogranatirhodine.

TABLE 13.4 Antibiotics produced by various Streptomyces strains and those transformed with plasmids pIJ2303 and pIJ2315

                                 Color of culture
Strain/plasmid                   Acidic   Alkaline      Antibiotic(s)
S. coelicolor                    Red      Blue          Actinorhodine
Streptomyces sp.                 Yellow   Brown         Medermycin
Streptomyces sp./pIJ2303         Red      Blue          Medermycin, actinorhodine
Streptomyces sp./pIJ2315         Red      Purple        Mederrhodine A, medermycin
S. violaceoruber B1140           Red      Blue-purple   Granaticin, dihydrogranaticin
S. violaceoruber B1140/pIJ2303   Red      Blue-purple   Granaticin, dihydrogranaticin, actinorhodine
S. violaceoruber Tü22            Red      Blue-purple   Granaticin, dihydrogranaticin
S. violaceoruber Tü22/pIJ2303    Red      Blue-purple   Dihydrogranatirhodine, actinorhodine

Adapted from Hopwood et al., Nature 314:642-644, 1985.
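
The color data in Table 13.4 can be treated as a simple lookup to show why
color alone identifies new antibiotics only "in some cases." The sketch
below is an illustration, not the procedure used by the original
researchers. It flags a transformant when its pair of acidic and alkaline
colors matches neither the untransformed host nor S. coelicolor, the source
of the plasmid DNA. The dictionary layout and the function name
unexplained_colors are assumptions made for this example.

```python
# (acidic color, alkaline color) for each strain, transcribed from Table 13.4
TABLE_13_4 = {
    "S. coelicolor": ("Red", "Blue"),
    "Streptomyces sp.": ("Yellow", "Brown"),
    "Streptomyces sp./pIJ2303": ("Red", "Blue"),
    "Streptomyces sp./pIJ2315": ("Red", "Purple"),
    "S. violaceoruber B1140": ("Red", "Blue-purple"),
    "S. violaceoruber B1140/pIJ2303": ("Red", "Blue-purple"),
    "S. violaceoruber Tü22": ("Red", "Blue-purple"),
    "S. violaceoruber Tü22/pIJ2303": ("Red", "Blue-purple"),
}
DONOR = "S. coelicolor"  # source of the actinorhodine genes on the plasmids

def unexplained_colors(table, donor=DONOR):
    """List transformants whose colors match neither the host nor the donor."""
    flagged = []
    for name, colors in table.items():
        if "/" not in name:
            continue  # skip untransformed strains
        host = name.split("/")[0]
        if colors != table[host] and colors != table[donor]:
            flagged.append(name)
    return flagged

print(unexplained_colors(TABLE_13_4))
```

Only the Streptomyces sp. strain carrying pIJ2315 is flagged, which
corresponds to the producer of mederrhodine A. The S. violaceoruber Tü22
transformant that makes dihydrogranatirhodine has the same colors as its
host, so a new compound of this kind would be missed by a color screen and
must be detected by chemical analysis.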

These new antibiotics represent minor structural variants of the
preexisting antibiotics actinorhodine, medermycin, granaticin, and
dihydrogranaticin and probably arise when an intermediate compound from
one biosynthetic pathway acts as a substrate for an enzyme from the other
pathway. As the biochemistry of various antibiotic biosynthetic pathways
has become better understood, it has become possible to design unique
antibiotics by genetic manipulation of the genes encoding the relevant
enzymes.

Engineering Polyketide Antibiotics 

The term "polyketide" defines a class of antibiotics that are synthesized
through the successive enzymatic condensation of small carboxylic acids,
such as acetate, propionate, and butyrate. Polyketide drugs include the
antibiotic erythromycin, the immunosuppressive drug FK506, and the
cholesterol-lowering drug lovastatin. While various polyketides are
produced by plants and fungi, the majority are produced by actinomycetes as
secondary metabolites. To create new polyketide antibiotics, the
functioning of the enzymes that synthesize these antibiotics must be
understood before the genes encoding the enzymes can be manipulated.

Polyketide antibiotics are synthesized by a complex enzymatic mechanism
analogous to that used for the synthesis of long-chain fatty acids. Each
condensation cycle results in the formation, on a growing carbon chain, of
a β-keto group. Polyketide synthesis consists of a number of steps that
are each repeated several times, including ketoreduction, dehydration, and
enoylreduction of the β-group of the growing polyketide chain. There are
two classes of polyketide synthases that are responsible for the synthesis
of polyketide antibiotics (Fig. 13.24). The synthases that catalyze the
biosynthesis of aromatic polyketides make up one class and generally
consist of one polypeptide with an active site for each successive reaction
(Fig. 13.24A). The second class includes synthases that are assemblies of
several polypeptides that have separate and distinct active sites for every
catalyzed step in polyketide biosynthesis (Fig. 13.24B). These enzymes have
a number of different domains (regions A through E) (Fig. 13.24), and each
domain has a separate enzymatic activity and active site catalyzing a
particular step in the process. The complete synthesis of a polyketide
antibiotic generally requires the participation of several of these
multifunctional enzymes; together, they make up the subunits of the
polyketide synthase.

FIGURE 13.24 Schematic representation of polyketide synthase for aromatic
polyketides. (A) The active site may be on a single polypeptide. (B)
Alternatively, the enzyme can consist of assemblies of polypeptides with
separate and distinct active sites. Both types of enzymes have different
domains (regions A through E), and each domain has a separate enzymatic
activity.

FIGURE 13.25 Altered erythromycin derivatives produced through genetic
manipulation. (A) A mutation in an enoylreductase gene caused a carbon-carbon double
bond to be introduced at positions C-6 and C-7 of the ring (highlighted). (B) A
deletion in a β-ketoreductase gene caused the erythromycin to have a carbonyl moiety
rather than a hydroxyl group at the C-5 carbon of the ring (highlighted). Adapted
from Katz and Donadio, Annu. Rev. Microbiol. 47:875-912, 1993.

If each of the enzymatic activities that is catalyzed by a domain on a
multifunctional polyketide synthase subunit catalyzes only a single
biochemical step in the pathway, the loss of any one activity should affect
only a single step in the overall synthesis. Moreover, alteration of a
catalytic domain whose function has been established should allow
researchers to make predictable changes to the structure of the synthesized
antibiotic. For example, a detailed knowledge of the genetics and
biochemistry of the components involved in the synthesis of the antibiotic
erythromycin allowed researchers to alter the biosynthetic genes in a
predetermined manner and to produce predictably altered derivatives of
erythromycin. Erythromycin is synthesized by Saccharopolyspora erythraea,
and the entire 56-kb DNA segment that contains the ery gene cluster has
been sequenced. The erythromycin polyketide synthase was altered in two
different ways: either (1) a DNA region that encoded β-ketoreductase
activity was deleted or (2) a DNA region encoding enoylreductase was
mutated. With the β-ketoreductase deletion, the erythromycin intermediates
that accumulated had a carbonyl moiety rather than a hydroxyl group at the
C-5 carbon of the ring (Fig. 13.25). Similarly, with the enoylreductase
mutation, a carbon-carbon double bond was introduced at positions C-6 and
C-7 of the ring (Fig. 13.25). These experiments indicate that once the
cluster of genes encoding the biosynthesis of a particular polyketide
antibiotic has been isolated and characterized, it is possible, by altering
specific DNA fragments, to modify an antibiotic biosynthesis pathway and
thereby to alter the structure of the antibiotic in a predictable manner.
Moreover, it is possible to cut and splice DNA fragments and thereby shift
polyketide synthase domains around to create novel polyketide antibiotics.
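
The predictability of these changes follows from the fixed order of the
reductive steps: ketoreduction, then dehydration, then enoylreduction. The
sketch below is a simplified model, assuming that each step can act only
on the product of the previous one. It predicts the group left at the
β-carbon after one condensation cycle from the set of active domains. The
abbreviations DH (dehydratase) and ER (enoylreductase) and the function
name beta_carbon_state are introduced here for illustration; KR denotes
β-ketoreductase.

```python
def beta_carbon_state(active_domains):
    """Predict the beta-carbon group after one cycle, given the active domains.

    Assumes the reductive steps act strictly in the order KR -> DH -> ER,
    so a missing domain stops processing at that point.
    """
    active = set(active_domains)
    if "KR" not in active:
        return "carbonyl"      # beta-keto group is left unreduced
    if "DH" not in active:
        return "hydroxyl"      # ketoreduction only
    if "ER" not in active:
        return "double bond"   # ketoreduction and dehydration
    return "methylene"         # fully reduced, as in fatty acid synthesis

# A cycle with only KR yields a hydroxyl; deleting KR leaves a carbonyl
print(beta_carbon_state({"KR"}), "->", beta_carbon_state(set()))
# A fully reducing cycle yields a methylene; inactivating ER leaves a double bond
print(beta_carbon_state({"KR", "DH", "ER"}), "->", beta_carbon_state({"KR", "DH"}))
```

Under this model, the two erythromycin experiments correspond to removing
KR from a cycle that normally stops at the hydroxyl stage and removing ER
from a cycle that normally reduces the chain completely.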

The gene clusters for aromatic polyketides all contain a set of three
genes encoding a so-called minimal polyketide synthase. Each minimal
polyketide synthase contains the activities for one cycle of polyketide
chain elongation. The minimal module has a ketosynthase (with an
acyltransferase domain), a chain length factor, and an acyl carrier
protein. The minimal polyketide synthase is responsible for the synthesis
of the aromatic polyketide backbone. Modifications to the basic structure
are catalyzed by other enzymes acting in concert with the minimal
polyketide synthase. The order of the modules in a polyketide synthase
specifies the sequence of the distinct two-carbon units, and the number of
modules determines the size of the polyketide chain. The genes encoding a
complete set of these proteins are generally organized into a single
cluster (Fig. 13.26). Each minimal polyketide synthase gene cluster encodes
the synthesis of a particular antibiotic. By interchanging genes between
clusters, new aromatic polyketide antibiotics have been created (Fig.
13.27). This experiment demonstrates the potential of using genetic
manipulation to design and produce novel aromatic polyketide antibiotics.
This approach promises to dramatically accelerate the process of discovery
of new antibiotics.

FIGURE 13.26 Gene clusters for the biosynthesis of the aromatic polyketide
antibiotics actinorhodine (act), tetracenomycin (tcm), frenolicin (fren), and griseusin (gris).
Each cluster contains genes encoding a minimal polyketide synthase (PKS), which
is responsible for the synthesis of the polyketide backbone. The enzymes encoded
by the other genes act to modify the growing polyketide chain. Each gene is shown
pointed in the direction in which it is transcribed.

FIGURE 13.27 Theoretical biosynthetic pathway for the production of the rationally
designed polyketides SEK43 and SEK26 by interchanging gene clusters. act,
actinorhodine; tcm, tetracenomycin; fren, frenolicin; gris, griseusin; min PKS, minimal
polyketide synthase; KR, β-ketoreductase; ARO, aromatase; CYC, cyclase.

Improving Antibiotic Production

In addition to being a means of developing new antibiotics, genetic
engineering can be used to enhance the yields and rates of production of
known antibiotics. The large-scale production of antibiotics by
Streptomyces spp. is often limited by the amount of oxygen available to the
cells. The low solubility of oxygen in aqueous media, combined with the
highly dense nature of filamentous Streptomyces cultures, often results in
an oxygen-depleted culture medium, a condition that causes poor cell growth
and reduced antibiotic yield. To overcome this problem, it may be possible
to improve the design of the bioreactors that are used to grow
antibiotic-producing Streptomyces cultures and to develop, by genetic
manipulation, Streptomyces strains that are better able to utilize the
available oxygen. These two approaches are not mutually exclusive.

One strategy used by some aerobic microorganisms to cope with oxygen-poor
environments is to synthesize a hemoglobin-like molecule that can sequester
oxygen from the medium and then deliver it to the cells. For example, the
aerobic bacterium Vitreoscilla produces a homodimeric heme protein that is
functionally similar to eukaryotic hemoglobin. The gene for the
Vitreoscilla hemoglobin was isolated and subcloned onto a Streptomyces
plasmid vector. Following expression of the Vitreoscilla hemoglobin gene in
S. coelicolor, the Vitreoscilla hemoglobin represented approximately 0.1%
of the total cellular protein, even though the expression was controlled by
the native Vitreoscilla hemoglobin gene promoter rather than by a
Streptomyces promoter. When both transformed and nontransformed
S. coelicolor cultures were grown in the presence of a low level of
dissolved oxygen (i.e., approximately 5% saturation), the transformed cells
with a functional Vitreoscilla hemoglobin produced 10 times more
actinorhodine per gram (dry weight) of cells and had greater cell densities
than did the nontransformed cells. The expression of the Vitreoscilla
hemoglobin gene in oxygen-starved microbial cells may provide them with a
general mechanism for obtaining sufficient oxygen to allow proliferation
under otherwise limiting conditions.

The compound 7-aminocephalosporanic acid (7ACA) is synthesized chemically
from the antibiotic cephalosporin C (Fig. 13.28) and is used as the
starting material for the chemical synthesis of a number of cephem-type
antibiotics (cephalosporins). Cephalosporins have few toxic effects on
humans and protect against many different bacteria. Unfortunately, there is
no known organism that can synthesize 7ACA. However, a novel 7ACA
biosynthetic pathway has been constructed in the fungus Acremonium
chrysogenum, which normally synthesizes only cephalosporin C. The genes
involved in this novel engineered pathway consist of a cDNA that encodes
D-amino acid oxidase and that comes from the fungus Fusarium solani and
genomic DNA that encodes cephalosporin acylase and comes from the
bacterium Pseudomonas diminuta. Both of these genes were subcloned
separately onto an A. chrysogenum plasmid expression vector under the
control of an A. chrysogenum promoter. In the first step of this new
pathway, cephalosporin C is converted into the compound
7-β-(5-carboxy-5-oxopentanamido)cephalosporanic acid (keto-AD-7ACA) by
D-amino acid oxidase (Fig. 13.28). Some of this product reacts with the
hydrogen peroxide that is a by-product of the reaction to form
7-β-(4-carboxybutanamido)cephalosporanic acid (GL-7ACA) (Fig. 13.28).
Cephalosporin C, keto-AD-7ACA, and GL-7ACA are each hydrolyzed by
cephalosporin acylase to form 7ACA. However, in the absence of the D-amino
acid oxidase step, only 5% of the cephalosporin C is converted to 7ACA;
therefore, both enzymes are essential for high yields of 7ACA. Although the
level of 7ACA that could be produced using this system was not sufficient
to make this work the basis of a commercially viable process, it
nevertheless demonstrates the feasibility of producing 7ACA biologically.
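
The engineered pathway can be written as a small directed graph in which
each conversion is labeled with the agent that brings it about. The sketch
below is an illustration only; it enumerates the routes from cephalosporin
C to 7ACA that are open for a given set of introduced enzymes. It assumes
that hydrogen peroxide is available only when D-amino acid oxidase is
present, since the text describes it as a by-product of that reaction. The
names PATHWAY and routes are hypothetical, and the model says nothing about
yields.

```python
# (substrate, product, agent) for each conversion described in the text
PATHWAY = [
    ("cephalosporin C", "keto-AD-7ACA", "D-amino acid oxidase"),
    ("keto-AD-7ACA", "GL-7ACA", "hydrogen peroxide"),
    ("cephalosporin C", "7ACA", "cephalosporin acylase"),
    ("keto-AD-7ACA", "7ACA", "cephalosporin acylase"),
    ("GL-7ACA", "7ACA", "cephalosporin acylase"),
]

def routes(enzymes, start="cephalosporin C", goal="7ACA"):
    """Enumerate every route from start to goal that the enzymes allow."""
    agents = set(enzymes)
    if "D-amino acid oxidase" in agents:
        agents.add("hydrogen peroxide")  # by-product of the oxidase reaction
    found = []

    def extend(node, path):
        if node == goal:
            found.append(path)
            return
        for substrate, product, agent in PATHWAY:
            if substrate == node and agent in agents and product not in path:
                extend(product, path + [product])

    extend(start, [start])
    return found

both = routes({"D-amino acid oxidase", "cephalosporin acylase"})
acylase_only = routes({"cephalosporin acylase"})
oxidase_only = routes({"D-amino acid oxidase"})
for path in both:
    print(" -> ".join(path))
```

With both enzymes, three routes lead to 7ACA. With the acylase alone, only
the direct hydrolysis of cephalosporin C remains, which is the route that
the text reports as converting only 5% of the substrate. With the oxidase
alone, no route reaches 7ACA.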

FIGURE 13.28 Genetically engineered biosynthetic pathway for the synthesis of
7ACA from cephalosporin C. The D-amino acid oxidase gene is from the fungus F.
solani, and the cephalosporin acylase gene is from the bacterium P. diminuta.

Medically important cephalosporins may be synthesized from either 7ACA or
the related compound 7-aminodeacetoxycephalosporanic acid (7ADCA).
Disrupting the functioning of the A. chrysogenum cefEF gene results in the
accumulation of large amounts of penicillin N (Fig. 13.29). Moreover, when
this mutant strain is transformed with a cefE gene from Streptomyces
clavuligerus, penicillin N is converted to deacetoxycephalosporin C (DAOC),
which can then be converted to 7ADCA. At this stage of the development of
this process, not all of the penicillin N is converted to DAOC; however, it
may be possible to further increase the expression of the cefE gene. Thus,
it is reasonable to expect that transgenic A. chrysogenum can eventually be
engineered to produce large amounts of 7ADCA.

Many antibiotic-producing organisms are slow growing, require special
growth conditions, or yield only small numbers of cells. To overcome these
problems, E. coli was engineered to produce polyketide antibiotics at rates
that are potentially useful for drug production. To do this, three genes
(each 10 to 12 kb in length) encoding the components of the polyketide
synthase from the bacterium S. erythraea were expressed in E. coli. Then, a
Bacillus subtilis gene that produces an enzyme that attaches the cofactor
phosphopantetheine to the polyketide synthase was cloned into the
engineered E. coli. In addition, to supply the polyketide synthase with
sufficient building blocks (propionyl-CoA and methylmalonyl-CoA) for
polyketide synthesis, the E. coli gene encoding an enzyme that breaks down
propionyl-CoA was inactivated and an S. coelicolor gene for propionyl-CoA
carboxylase was introduced. Given the relative ease with which E. coli can
be genetically manipulated and then grown in large-scale culture, this work
may be a significant breakthrough for the development and production of new
antibiotics.

Designer Antibiotics 

In recent years, there has been an enormous increase in the prevalence of
antibiotic-resistant bacterial infections. At present in the United States,
more deaths are attributable to infections by methicillin-resistant strains
of Staphylococcus aureus than to human immunodeficiency virus/AIDS. An
important component of S. aureus virulence is the carotenoid pigment
staphyloxanthin that is synthesized by the bacterium. As a consequence of
the large number of conjugated double bonds that this compound possesses,
it can detoxify the reactive oxygen species that are produced by the host
immune system in response to the bacterial infection. On the other hand,
strains of S. aureus that do not contain staphyloxanthin are rapidly
inactivated by the reactive oxygen species produced by host neutrophils.
This observation suggests that the disruption of S. aureus staphyloxanthin
synthesis might be a suitable target to prevent the proliferation and
toxicity of the bacterium.

In the first committed step in the biosynthesis of S. aureus
staphyloxanthin, the enzyme dehydrosqualene synthase condenses two
molecules of farnesyl diphosphate to produce presqualene diphosphate (Fig.
13.30). In S. aureus, this compound is modified to yield dehydrosqualene,
which is converted into 4,4'-diaponeurosporene and eventually into
staphyloxanthin. Interestingly, the synthesis of cholesterol in humans also
proceeds through presqualene diphosphate. In order to determine whether the
S. aureus enzyme that catalyzed the first committed step could be a target
for a designer antibiotic, the bacterial gene was cloned and overexpressed,
the enzyme was purified to homogeneity, and its X-ray crystallographic
structure was determined to 1.58-Å resolution. Based on the
three-dimensional structure of the enzyme, as well as the presence of
inhibitors of the human enzyme (from the cholesterol synthesis pathway)
that performed the same function, a series of eight chemical inhibitors was
designed, synthesized chemically, and tested. Three of the inhibitors, when
tested at levels below 1 µM, dramatically blocked the conversion of
farnesyl diphosphate to presqualene diphosphate (which is highly pigmented
and therefore easy to visualize). In fact, one of the best inhibitors
turned out to be a drug candidate that had already undergone preliminary
testing in humans for its ability to lower cholesterol levels. Despite the
fact that these results are preliminary, the compound that was selected
using this strategy caused a 98% decrease in surviving S. aureus in
infected mice. Thus, while much remains to be done, this work demonstrates
that by choosing a nonconventional target it is possible to develop
designer antibiotics that render pathogenic bacteria susceptible to human
and animal immune systems.
FIGURE 13.29 Synthesis of the compound 7ADCA in a genetically engineered
strain of A. chrysogenum. The functioning of the endogenous A. chrysogenum
cefEF gene is disrupted so that penicillin N accumulates. This strain is
transformed with a cefE gene from S. clavuligerus, and the penicillin N is
transformed into DAOC, which is subsequently converted to 7ADCA.

FIGURE 13.30 Simplified overview of the biosynthesis of staphyloxanthin in S. aureus
and cholesterol in humans. The first step (the committed step) is the same in both
pathways.


Biopolymers

Biopolymers are large, multiunit macromolecules synthesized by
microorganisms, plants, and animals. Some of these polymers have physical
and chemical properties that are useful to the food-processing,
manufacturing, and pharmaceutical industries. The ability to genetically
engineer organisms has stimulated researchers to design new biopolymers,
replace synthetic polymers with biological equivalents, modify existing
biopolymers to enhance their physical and structural characteristics, and
find ways to increase the yields and decrease the costs of biopolymers
produced by industrial processes.

Xanthan Gum

Xanthomonas campestris is a gram-negative obligatory aerobic soil
bacterium that produces the commercially important biopolymer xanthan gum,
a high-molecular-weight exopolysaccharide, as a by-product of its
metabolism. This polymer has a cellulosic backbone made up of a
straight-chain polymer of glucose units. Each of its trisaccharide side
chains includes one glucuronic acid and two mannose residues, which are
attached to every second glucose residue of the backbone (Fig. 13.31).
Xanthan gum has high viscosity, is stable in extreme physical and chemical
environments, and exhibits physical and chemical properties similar to
those of a plastic. In particular, its physical properties make it useful
as a stabilizing, emulsifying, thickening, or suspending agent. For
successful commercial production of xanthan gum, X. campestris should be
grown on an inexpensive and plentiful carbon source. Wild-type
X. campestris can efficiently utilize glucose, sucrose, and starch, but not
lactose, as a carbon source.

FIGURE 13.31 Structure of xanthan gum. The repeating unit, designated n, forms a
chain of glucose units. The trisaccharide is attached to alternate glucose residues of
the repeating chain.

Whey is a waste by-product of the cheese-making process that consists of
water (94 to 95%), lactose (3.5 to 4%), and small amounts of protein,
minerals, and low-molecular-weight organic compounds. Enormous quantities
of whey are generated by the dairy industry, and its disposal is a major
problem. In North America, whey has been used extensively as a "filler" in
prepared foods; however, with the increasing awareness that large numbers
of individuals are lactose intolerant, it is imperative that alternative
uses be found for this material. Moreover, disposing of whey by releasing
it into rivers and lakes can deplete the amount of available oxygen,
thereby killing many of the aquatic organisms. Transporting whey to
landfill sites is exceptionally expensive, and potential groundwater
contamination by the discarded whey is a major concern. Finally, the costs
of removing the solid component of whey are prohibitive. Consequently, many
schemes have been devised to use whey in creative ways.

Theoretically, whey could be used as a carbon source for growing 
industrially important microorganisms. With this in mind, X. campestris 
was genetically engineered to grow on whey. The E. coli lacZY genes, which
encode the enzymes β-galactosidase and lactose permease, were cloned
onto a broad-host-range plasmid under the transcriptional control of an X.
campestris bacteriophage promoter (Fig. 13.32). This construct was introduced
into E. coli and then transferred from E. coli to X. campestris by tripartite
mating. Transformants maintained the plasmid, expressed the enzymes
β-galactosidase and lactose permease at high levels, utilized lactose as the
sole carbon source, and produced large amounts of xanthan gum with glucose,
lactose, or whey as the carbon source (Table 13.5). By contrast, wild-type
X. campestris produces large amounts of xanthan gum only when
grown on glucose (Table 13.5). This system may well be able to convert a
nuisance waste product into a substrate for the production of an economically
valuable biopolymer.

FIGURE 13.32 Engineering E. coli lacZ (encoding β-galactosidase) and lacY (encoding
lactose permease) genes for constitutive expression in X. campestris (Xc).
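
The effect of the lacZY plasmid can be made concrete by comparing the yields reported in Table 13.5. The following is a minimal sketch, not part of the original study; the names YIELDS and fold_change are illustrative assumptions, and the only data used are the six values from the table (micrograms of xanthan gum per milliliter of culture).

```python
# Xanthan gum yields (micrograms per milliliter of culture), copied from Table 13.5.
YIELDS = {
    "wild type": {"0.4% glucose": 3530, "0.4% lactose": 245, "10% whey": 224},
    "transformant": {"0.4% glucose": 3711, "0.4% lactose": 3608, "10% whey": 4241},
}


def fold_change(yields):
    """Return the transformant/wild-type yield ratio for each carbon source."""
    wild = yields["wild type"]
    transformed = yields["transformant"]
    return {source: transformed[source] / wild[source] for source in wild}


if __name__ == "__main__":
    for source, ratio in fold_change(YIELDS).items():
        print(f"{source}: {ratio:.1f}-fold")
```

On glucose the two strains are nearly equivalent (a ratio of about 1.05), whereas on lactose and on whey the transformant produces roughly 15 and 19 times as much xanthan gum as the wild type, respectively.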

Melanin 

Melanins are a large, diverse family of light-absorbing biopolymers that are 
synthesized by animals, plants, bacteria, and fungi. It has been suggested 
that these pigments might be useful as topical sunscreens, sunlight-protec¬ 
tive coatings for plastics, or additives for cosmetic products. Currently, 
melanins are obtained in small quantities either by extraction from natural 
sources or by chemical synthesis. However, recombinant DNA technology 
has made it possible to produce a range of melanins with different physical 
properties inexpensively and on a large scale. 

Biochemically, melanins are irregular, somewhat random polymers 
that are composed of indoles, benzthiazoles, and amino acids. The first step 
in their synthesis, which is catalyzed by the copper-containing monooxy¬ 
genase tyrosinase, is the oxidation of tyrosine to dihydroxyphenylalanine 
quinone. The final stages of the polymerization of melanin are nonenzy- 
matic, and depending on the chemical nature of the nonquinone compo¬ 
nents that are incorporated into the polymeric structure (typically 
hydroxylated organic compounds), the end product can be black, brown, 
yellow, red, or violet. 

TABLE 13.5 Production of xanthan gum by wild-type and transformed X. campestris

                 Amount of xanthan gum produced (µg/mL) with:
X. campestris    0.4% Glucose    0.4% Lactose    10% Whey
Wild type        3,530           245             224
Transformant     3,711           3,608           4,241

Adapted from Fu and Tseng, Appl. Environ. Microbiol. 56:919-923, 1990.

The amount of the product is expressed as micrograms per milliliter of culture grown on a minimal
medium with either 0.4% glucose or 0.4% lactose added or on diluted whey (10%), which contains
approximately 0.44% lactose. The transformant carries the E. coli lacZY genes on a plasmid.

The genes involved in melanin biosynthesis in the bacterium
Streptomyces antibioticus have been isolated and analyzed. These genes were
selected from a clone bank of S. antibioticus DNA on the basis of the ability
to change color in the presence of specific compounds that were added to 
the medium. They consist of two open reading frames (ORFs), one encoding 
tyrosinase (molecular weight, 30,600) and one (ORF438) encoding a protein 
of unknown function with a molecular weight of approximately 14,800. To 
test whether both of these genes are required for melanin production, they 
were subcloned into an E. coli expression vector, where one construct con¬ 
tained only the tyrosinase gene and another carried both the tyrosinase and 
the ORF438 genes (Fig. 13.33). The vector with the tyrosinase gene directed 
the synthesis of a larger amount of tyrosinase than did the vector con¬ 
taining both the tyrosinase and the ORF438 genes. However, the amount of 
tyrosinase was irrelevant, because it turned out that melanin biosynthesis 
required the products of both genes. The protein encoded by ORF438 may 
act as a copper donor to apotyrosinase, the inactive precursor form of tyro¬ 
sinase. Apotyrosinases are activated by acquiring copper ions. Under nat¬ 
ural conditions, after dihydroxyphenylalanine quinone is produced by 
tyrosinase, a variety of low-molecular-weight compounds (nonquinones) 
can be incorporated into the final polymer. The chemical and physical 
nature, including the color, of the melanin that is formed after cloning of 
the key genes into E. coli may be manipulated to some extent, to form 
melanins with different properties, by the addition of different amounts of 
specific low-molecular-weight compounds to the medium. 

FIGURE 13.33 E. coli expression plasmids carrying melanin biosynthesis genes.
Plasmid pBGC619 contains the tyrosinase gene. Plasmid pBGC620.3 contains an
ORF (ORF438) for melanin synthesis and the tyrosinase gene. Transcription of the
cloned genes is under the control of the E. coli bacteriophage T7 promoter (pT7).
RBS1 and RBS2 denote two different ribosome-binding sites. The plasmids both
carry genes that confer resistance to ampicillin (Ampr).

MILESTONE

Production of 2-Keto-L-Gulonate, an Intermediate in
L-Ascorbate Synthesis, by a Genetically Modified
Erwinia herbicola

S. Anderson, C. B. Marks, R. Lazarus, J. Miller, K. Stafford,
J. Seymour, D. Light, W. Rastetter, and D. Estell
Science 230:144-149, 1985

Theoretically, prokaryotes that produce specific metabolites can be
genetically manipulated in two different ways. First, the activity or
amount of one or more of the enzymes in a pathway encoding the synthesis
of a particular metabolite can be modified so that the amount of
metabolite that the bacterium produces is increased. Second, foreign
genes that produce enzymes that can use an endogenous metabolite as a
substrate for the production of another metabolite not normally produced
by the host bacterium can be introduced. While these sorts of
manipulations are easy in theory, it is not necessarily easy to isolate
and manipulate the required genes or to establish the appropriate
conditions that enable complex biosynthetic pathways to function properly.

To create a bacterium that synthesized 2-keto-L-gulonic acid, which is
the immediate precursor of commercially synthesized vitamin C, Anderson
et al. isolated the Corynebacterium gene encoding the enzyme that
converts 2,5-diketo-D-gluconic acid to 2-keto-L-gulonic acid and
transferred it to an Erwinia sp., a bacterium that synthesizes
2,5-diketo-D-gluconic acid from D-glucose. The isolation of this gene was
difficult because the enzyme had not been previously studied to any great
extent. Therefore, before the gene could be isolated, the protein had to
be purified and partially sequenced so that DNA hybridization probes
based on the amino acid sequence of the protein could be designed. This
work is an early example of what some workers have come to call metabolic
engineering, which entails taking the genetic information for part of a
metabolic pathway from one organism and transferring it into another
organism to create a novel metabolic pathway.

Adhesive Protein

Researchers are trying to inexpensively produce an adhesive protein
biopolymer, originally isolated from the blue mussel Mytilus edulis, in microbial
cells. This biopolymer is an exceptionally strong, waterproof adhesive
protein, called byssal adhesive, that enables the mussel to attach very tightly
to a variety of surfaces. Following its secretion, the byssal adhesive becomes
highly cross-linked (randomly), and consequently, the protein cannot be
sequenced. Without this information, it is impossible to deduce nucleic acid
sequences that might be used for the synthesis of DNA hybridization
probes. However, it was possible to isolate an intracellular precursor form
of the adhesive protein, called the 130-kilodalton (kDa) precursor protein,
that can be analyzed biochemically. It was found that the 130-kDa precursor
protein is rich in serine, threonine, lysine, proline (Pro), and tyrosine;
60 to 70% of the amino acids contain a hydroxyl group. Most of the proline
residues are hydroxylated to either 3- or 4-hydroxyproline (Hyp), and the
majority of the tyrosines are hydroxylated to 3,4-dihydroxyphenylalanine
(DOPA). Amino acid sequence analysis of the precursor protein further
revealed that it is composed largely of repeating units that consist of a
decapeptide with the sequence Ala-Lys-(Pro or Hyp)-Ser-(Tyr or DOPA)-
Hyp-Hyp-Thr-DOPA-Lys; 7 of these 10 amino acids are hydroxylated.

The cDNA for the 130-kDa precursor adhesive protein was isolated 
from a cDNA library that was constructed with messenger RNA (mRNA) 
isolated from the gland that actively secretes the byssal adhesive. Both the 
adhesive protein and the cDNA have unusual features that might make 
cloning, expression, and production of a functional adhesive protein diffi¬ 
cult in a heterologous host. First, the highly repetitive nature of the adhe¬ 
sive protein cDNA could make it unstable as a result of homologous 
recombination and subsequent loss of portions of the cloned sequence. 
Second, proline, lysine, and tyrosine represent about 70% of the amino 
acids of the protein; therefore, very high levels of synthesis may not be 
achievable because the corresponding intracellular aminoacyl-transfer 
RNA (tRNA) pools might be limiting. 

When either complete or partial cDNAs for the adhesive protein were 
cloned onto yeast expression vectors and introduced into yeast cells, active 
novel forms of the adhesive protein, ranging from 20 to 100 kDa, were syn¬ 
thesized and represented a significant fraction (2 to 5%) of the total cell 
protein. Thus, there were no problems concerning either the stability of the 
cloned cDNA or the production of moderate amounts of the adhesive pro¬ 
tein. Considerably higher expression levels were attained when a chemi¬ 
cally synthesized adhesive protein gene sequence was expressed in E. coli 
(Fig. 13.34). In this case, repeating DNA units that encode the consensus 
decapeptide repeat of the adhesive protein were used to construct a
600-base-pair (bp) synthetic gene that encoded a protein with a molecular mass
of approximately 25 kDa. The 30-bp repeat, the fundamental building block 
of the synthetic gene, consisted of codons optimized for E. coli expression. 
The synthetic gene was expressed at very high levels by using the T7 pro¬ 
moter. Notwithstanding the level of expression of the adhesive protein, 
most microorganisms are limited in their abilities to hydroxylate amino 
acids posttranslationally, so the final protein might be underhydroxylated. 
In fact, a number of tyrosine residues of the protein were not converted to 
DOPA, an outcome that limits the ability of the protein to form cross-links. 
This deficiency was overcome by creating an in vitro hydroxylation system 
that used a bacterial tyrosinase in the presence of ascorbic acid to hydroxy¬ 
late the tyrosine residues to DOPA (Fig. 13.35). Ascorbic acid was included 
in the reaction mixture to prevent the premature oxidation of the DOPA 
residues to o-quinone. Oxidation must be controlled because it leads to 
cross-linking of the adhesive protein subunits. Like many other adhesives 
or glues, the protein adhesive must not be activated (cross-linked) before 
its actual use. 
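
The design of the synthetic gene can be checked with a short script. The sketch below is illustrative only: it uses the two oligonucleotide sequences printed in Fig. 13.34, a codon table restricted to the codons that occur in them, and helper names (translate, complement, sticky_ends_match, repeats_in_gene) that are assumptions introduced here. The 9-nucleotide stagger between the two strands is inferred from the complementarity of the printed sequences, not stated in the text.

```python
# Codon table limited to the codons present in the Fig. 13.34 oligonucleotide.
CODONS = {
    "CCA": "Pro", "CCG": "Pro", "ACC": "Thr", "TAC": "Tyr", "TAT": "Tyr",
    "AAA": "Lys", "AAG": "Lys", "GCT": "Ala", "TCT": "Ser",
}

TOP = "CCAACCTACAAAGCTAAGCCGTCTTATCCG"          # top strand, written 5' -> 3'
BOTTOM_3TO5 = "TTTCGATTCGGCAGAATAGGCGGTTGGATG"  # bottom strand, written 3' -> 5'
STAGGER = 9  # assumed offset (nucleotides) between the strands, inferred from pairing


def translate(dna):
    """Translate a coding-strand sequence codon by codon."""
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def complement(dna):
    """Return the base-by-base complement (same left-to-right order)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


def sticky_ends_match(top, bottom_3to5, stagger=STAGGER):
    """Check that the strands pair in the middle and that the two
    single-stranded extensions can base pair when modules line up head to tail."""
    paired = len(top) - stagger
    duplex_ok = complement(top[stagger:]) == bottom_3to5[:paired]
    extensions_ok = complement(bottom_3to5[paired:]) == top[:stagger]
    return duplex_ok and extensions_ok


def repeats_in_gene(gene_length_bp, repeat_length_bp=30):
    """Number of whole repeat units in a gene of the given length."""
    return gene_length_bp // repeat_length_bp


PEPTIDE = translate(TOP)

if __name__ == "__main__":
    print("-".join(PEPTIDE))
    print("extensions complementary:", sticky_ends_match(TOP, BOTTOM_3TO5))
    print("repeats in a 600-bp gene:", repeats_in_gene(600))
```

Under these assumptions, the top strand encodes the decapeptide Pro-Thr-Tyr-Lys-Ala-Lys-Pro-Ser-Tyr-Pro, the two 9-nucleotide extensions are complementary to each other, and a 600-bp gene corresponds to 20 copies of the 30-bp repeat.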

When the precursor form of the adhesive protein is oxidized, the cross- 
linked protein can bind to a variety of surfaces, including polystyrene, 
glass, hydrogel, and collagen. Moreover, the "strength" and specificity of 
the final adhesive can be manipulated by adding other proteins to the 
adhesive protein mixture before oxidation and cross-linking. By varying 
the kinds and amounts of the accessory proteins, adhesives with unique 
properties can be created. It is anticipated that biopolymeric adhesives will 
be used extensively in both medicine and dentistry. 

More recently, researchers have isolated and expressed in E. coli the 
cDNA for the type 5 foot protein (adhesive protein) from the mussel 
Mytilus galloprovincialis. In this case, the protein that is produced in recom¬ 
binant E. coli cells contains a tag of six histidine residues (to facilitate pro¬ 
tein purification), as well as a slightly larger amount of DOPA than is found 
in the M. edulis protein. A higher number of DOPA residues is believed to 
result in a protein with greater adhesive properties. Unfortunately, the 
adhesive properties of this protein make it extremely difficult to purify (i.e., 
it sticks to everything), thereby limiting its commercial possibilities. 

FIGURE 13.34 Synthetic oligonucleotide used in the assembly of a gene for the
bioadhesive protein produced by the mussel M. edulis. Two oligonucleotides are designed
to base pair and form a DNA module with complementary extensions, which base
pair to form a linear DNA molecule with a repeating sequence. The repeating units
are joined by the enzyme T4 DNA ligase. The amino acid sequence encoded by the
DNA repeat is shown. The decapeptide repeating unit contains three hydroxylated
amino acids (Tyr, Ser, and Thr) plus three Pro residues which are subsequently
hydroxylated.

5' CCA ACC TAC AAA GCT AAG CCG TCT TAT CCG 3'
3'             TTT CGA TTC GGC AGA ATA GGC GGT TGG ATG 5'
   Pro-Thr-Tyr-Lys-Ala-Lys-Pro-Ser-Tyr-Pro-Pro-Thr-Tyr

Tyrosine → DOPA → o-Quinone

FIGURE 13.35 Pathway for in vitro posttranslational hydroxylation of some of the
tyrosine residues in the M. edulis adhesive protein. Tyrosine is converted to DOPA
by the action of the enzyme tyrosinase and then can be oxidized to o-quinone by
either catechol oxidase or tyrosinase. The oxidation of DOPA to o-quinone can be
prevented by the addition of ascorbic acid.

Rubber

Natural rubber, cis-1,4-polyisoprene, is an extensively used biopolymer
that is obtained from a large number of different plants. The biosynthesis
of rubber starts from simple sugars and requires approximately 17
enzyme-catalyzed steps, with the final step being the polymerization of isopentenyl
pyrophosphate onto an allylic pyrophosphate. The last step is catalyzed by
the enzyme rubber polymerase.

Studies have been undertaken to determine whether rubber can be 
synthesized by genetically engineered microorganisms. As an initial step in 
this direction, a cDNA library was constructed by using mRNA from the 
rubber-producing plant Hevea brasiliensis. This library was then screened 
with a short synthetic DNA hybridization probe whose sequence was 
based on the amino acid sequence of a portion of the rubber polymerase 
enzyme. Antibodies directed against the purified enzyme were used to 
prove unequivocally that the cloned cDNA expressed rubber polymerase. 
This cDNA clone can now be used, possibly in concert with other genes in 
the rubber synthesis pathway, in an attempt to produce natural rubber in a 
microbial system. Alternatively, it can be used as a source of rubber poly¬ 
merase to develop an in vitro catalytic system. In either case, research that 
might lead to a new synthetic route for the production of rubber is under 
way. 

Polyhydroxyalkanoates 

Polyhydroxyalkanoates are a class of biodegradable polymers that are pro¬ 
duced by a number of different microorganisms, most notably Alcaligenes 
eutrophus, and used as an intracellular carbon and energy storage material. 
These compounds have thermoplastic or elastic properties, depending on 
the polymer composition, and are being considered for use in the synthesis 
of a range of biodegradable plastics. It has been estimated that by 2012 the 
U.S. market for biodegradable plastics will be around $1 billion per year. 

Poly(3-hydroxybutyric acid) is the most thoroughly studied and char¬ 
acterized polyhydroxyalkanoate. Both the polymer and the A. eutrophus 
genes that encode its synthesis have been characterized. Poly(3- 
hydroxybutyric acid), its copolymer [poly(3-hydroxybutyrate-co-3-hy- 
droxyvalerate)], and another polyhydroxyalkanoate [poly(3-hydroxyvaleric 
acid)] are produced commercially in the United Kingdom by the fermenta¬ 
tion of A. eutrophus. 
Although it is possible to produce poly(3-hydroxybutyric acid) as a by-product
of the growth of A. eutrophus, the organism grows relatively slowly,
requires a relatively low growth temperature (so that the fermentation
vessel must be cooled), is difficult to lyse [making the purification of
poly(3-hydroxybutyric acid) granules difficult], and utilizes only a limited
number of carbon sources for growth (making production costs relatively
high). On the other hand, when the genes for the biosynthesis of this
polymer were transferred to E. coli, the resultant transformant grew rapidly
to a high cell density and accumulated very large amounts (up to 95% of
the dry cell weight) of poly(3-hydroxybutyric acid). Poly(3-hydroxybutyric
acid) is synthesized from acetyl-CoA in three steps catalyzed by three
enzymes (Fig. 13.36). The operon containing these genes was cloned into a
plasmid as part of a 5.2-kb insert. Unfortunately, plasmids expressing the
poly(3-hydroxybutyric acid) operon in E. coli were unstable. In the absence
of selective pressure, such as the addition of antibiotics to the growth
medium, about half of the E. coli cells lost the plasmid after approximately
50 generations. Plasmid loss of this magnitude, while not a major concern
in small-scale batch cultures, becomes more of a problem with large-scale
or continuous cultures (see chapter 17). This problem was overcome by
inserting the parB genetic locus from another plasmid onto plasmids carrying
the poly(3-hydroxybutyric acid) operon. This locus mediates plasmid
stabilization by postsegregational killing of plasmid-free cells. The modified
plasmids were quite stable even though the poly(3-hydroxybutyric
acid) was produced constitutively, which places a metabolic load on the
cells. An added benefit of producing poly(3-hydroxybutyric acid) in E. coli
instead of A. eutrophus is that when the poly(3-hydroxybutyric acid) is
recovered by extraction with an alkaline hypochlorite solution, the polymer
is degraded to a much lesser extent than when it is produced in A. eutrophus.
This is probably because most of the poly(3-hydroxybutyric acid) in
E. coli is produced in a crystalline state, while in A. eutrophus it is amorphous.
Nevertheless, the polymers extracted from the two organisms had
identical polymer properties. In addition, E. coli transformants synthesizing
poly(3-hydroxybutyric acid) produced very little acetate, which can be
deleterious for cell growth, presumably because all of the excess acetyl-CoA
of the cell was converted to poly(3-hydroxybutyric acid) rather than
acetate.

FIGURE 13.36 Synthesis of poly(3-hydroxybutyric acid) from acetyl-CoA. The
enzyme that catalyzes each of the reactions is shown to the right of the arrow.

FIGURE 13.37 Model of the genetically engineered microbial synthesis of the
copolymer poly(3-hydroxybutyrate-co-3-hydroxyvalerate).
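
To see why the loss of the plasmid from about half of the cells in 50 generations matters more in large-scale or continuous cultures, the observation can be turned into a per-generation loss rate. The sketch below is a simplification, not a model from the original work: it assumes a constant probability of plasmid loss at each generation and equal growth rates for plasmid-bearing and plasmid-free cells (an assumption that is likely optimistic, given the metabolic load of the plasmid). The function names are hypothetical.

```python
def per_generation_loss(fraction_retained, generations):
    """Constant per-generation loss probability p satisfying
    (1 - p) ** generations == fraction_retained."""
    return 1.0 - fraction_retained ** (1.0 / generations)


def fraction_with_plasmid(p_loss, generations):
    """Fraction of cells still carrying the plasmid after the given
    number of generations, assuming equal growth rates."""
    return (1.0 - p_loss) ** generations


# About half of the cells retain the plasmid after roughly 50 generations.
P_LOSS = per_generation_loss(0.5, 50)

if __name__ == "__main__":
    print(f"loss per generation: {P_LOSS:.2%}")
    for n in (10, 50, 100, 200):
        print(f"{n:>3} generations: {fraction_with_plasmid(P_LOSS, n):.1%} retain the plasmid")
```

Under these assumptions the loss rate is about 1.4% per generation, which has little effect over the few generations of a small batch culture but leaves only about a quarter of the cells with the plasmid after 100 generations.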

The copolymer poly(3-hydroxybutyrate-co-3-hydroxyvalerate) has 
properties that are similar to those of polypropylene. Consequently, there 
is considerable commercial interest in the biological production of
poly(3-hydroxybutyrate-co-3-hydroxyvalerate). However, E. coli strains that
expressed the three polymer biosynthetic genes synthesized only
poly(3-hydroxybutyric acid) and not the copolymer. This limitation was overcome
with E. coli cells that were mutated at both the fadR and atoC loci. The FadR
protein is a negative regulator of fatty acid biosynthesis, and the fadR
mutant activates the glyoxylate shunt, enhancing the capacity for energy
metabolism and biosynthesis, which leads to a reduction of acetate excretion
and improvement of the biomass yield. The atoC gene product is a
positive regulator of fatty acid uptake, and the gene product from the atoC
mutation turns on the expression of atoA and atoD,
whose gene products facilitate the uptake of propionate from the growth
medium into the cell. The propionate is converted to propionyl-CoA and 
then condensed with acetyl-CoA to form 3-ketovaleryl-CoA, which can be 
converted into 3-hydroxyvaleryl-CoA before its incorporation into the 
copolymer (Fig. 13.37). The amount of 3-hydroxyvalerate in the copolymer 
is dependent on the percentage of propionate used during the fermenta¬ 
tion, but it never exceeds 40%. 

In addition to engineering the composition of polyhydroxyalkanoate, to 
produce polymers with specific desired properties, it is also essential that 
the chain lengths of these polymers be regulated. Some polymerizing 
enzymes from bacteria such as Ralstonia eutropha yield primarily short-chain
polymers with approximately 4 or 5 of the 3-hydroxyalkanoate monomeric
units, while other enzymes from different bacteria, including Pseudomonas 
oleovorans, produce medium-chain polymers that include 6 to 14 monomeric 
units. In addition, mutants of the various steps of fatty acid oxidation can be 
generated and used to produce monomeric units with a modified composi¬ 
tion compared with the wild-type strain. These modified monomers may 
then be incorporated into the polyhydroxyalkanoate. As a consequence of 
these manipulations, a wide range of polyhydroxyalkanoates with different 
physical and chemical properties have been synthesized. 

It would be economically advantageous if bacteria could be engineered 
to efficiently produce polyhydroxyalkanoates using industrial waste prod¬ 
ucts as a carbon source. To achieve this goal, polyhydroxybutyrate biosyn¬ 
thesis genes from a strain of an Azotobacter sp. were spliced onto a plasmid 
under the transcriptional control of the lac promoter (Fig. 13.38). The con¬ 
structed plasmid was introduced into an E. coli strain that contained genes 
for the uptake and assimilation of lactose but that did not encode the lac 
repressor (see chapter 6). Thus, both lactose uptake and assimilation genes, 
as well as polyhydroxybutyrate biosynthesis genes, were expressed constitutively.
The E. coli strain that carried the constructed plasmid was able to
grow on either 25% lactose (a by-product of cheese making) or corn steep
liquor (a by-product of corn [maize] processing) and to produce a significant
level of polyhydroxybutyrate. By growing the transformed E. coli strain 
aerobically in a fed-batch culture (see chapter 17), after 24 hours, the cells 
accumulated polyhydroxybutyrate to 73% of their cell dry weight. Moreover, 
the physical properties of the polyhydroxybutyrate that was produced were 
similar to the properties of the polymer isolated from the Azotobacter sp. 
This engineered E. coli strain may be a suitable vehicle for producing a
variety of polyhydroxyalkanoates from industrial waste products. 

Hyaluronic Acid 

Hyaluronic acid is a glycosaminoglycan, a polymer consisting of a repeating 
disaccharide unit of D-glucuronic acid and D-N-acetylglucosamine linked 
by β-1,4 and β-1,3 glycosidic bonds (Fig. 13.39), that in vivo can range in
size from 5 to 20 kDa. This polymer is a component of the articular carti¬ 
lage, where it is present as a coat around the cells; it is important in tissue 
hydrodynamics, movement, and cell proliferation; and it is used to treat 
osteoarthritis and to facilitate wound healing. Hyaluronic acid is also used 
as a component of some cosmetics and skin moisturizers. In 2005, the 
worldwide market for hyaluronic acid was a little over $1 billion, with 
most being supplied from rooster combs or the outer capsule of strains of 
group C Streptococcus. Both sources of hyaluronic acid can be problematic. 


FIGURE 13.38 Azotobacter sp. genes encoding enzymes responsible for the biosyn¬ 
thesis of polyhydroxybutyrate under the transcriptional control of the E. coli lac 
promoter. The arrow indicates the direction of transcription. When a plasmid con¬ 
taining this construct is introduced into an E. coli strain that does not encode the lac 
repressor, the pha genes are expressed constitutively. phaA, 3-ketothiolase; phaB, 
acetoacetyl-CoA reductase; phaC, polyhydroxyalkanoate synthase. 
The rooster comb-based product can cause severe inflammation in individuals
allergic to avian antigens, while the Streptococcus-based product is
both difficult and expensive to produce. It would therefore be advanta¬ 
geous to have an alternative source of hyaluronic acid. 

B. subtilis is a well-established industrial bacterium that can secrete 
large amounts of synthesized products while at the same time being very 
economical to grow on inexpensive medium on a large scale. In addition, 
B. subtilis does not produce any exo- or endotoxins or the enzyme
hyaluronidase (which degrades hyaluronic acid). The Streptococcus equisimilis
gene encoding the enzyme that catalyzes the last (and key) step in the
synthesis of hyaluronic
acid was isolated and then overexpressed in B. subtilis (Fig. 13.40), along
with two B. subtilis genes that encode enzymes that provide the metabolites
needed for the synthesis of hyaluronic acid. Following the large-scale
growth in a bioreactor of this engineered B. subtilis strain, the amount of
hyaluronic acid that was produced was comparable to the level produced
by streptococcal strains (which grow more slowly), and the hyaluronic acid
was secreted into the medium and not cell associated (as is the case with
streptococcal strains), making it easier to isolate and purify. While this
system may require some additional manipulation of the B. subtilis host
strain to increase the yield of hyaluronic acid, this work is an important
step toward the development of a commercial system for the bacterial
production of hyaluronic acid.

FIGURE 13.40 Flowchart of the engineering of B. subtilis to produce hyaluronic acid.


SUMMARY 


In addition to using bacteria as factories for the production
of proteins, such as restriction enzymes, it is possible to
modify the metabolic pathways of organisms, either by introducing
new genes or by altering existing ones. In this way,
various organisms can be genetically engineered for the production
of a range of low-molecular-weight compounds, such
as L-ascorbic acid, indigo, amino acids, antibiotics, lycopene,
succinic acid, and the monomeric subunits of various biopolymers,
such as xanthan gum, melanin, adhesive protein, rubber,
polyhydroxyalkanoates, and hyaluronic acid. Here, the
strategy is to insert the genes for one or more specific enzymes
into the host organism by transformation with a vector-cloned
gene construct. When they are expressed, the inserted genes
encode a new pathway or augment a preexisting pathway for
the synthesis of a specific compound. In addition, the biosynthesis
of a desired compound may often be significantly
increased by modulating the metabolic flux of the organism
by turning on some pathways and blocking others. Several
studies have shown that the creation of such unusual enzymatic
pathways is technically feasible. Moreover, recombinant
DNA technology has led to the development of new and more
efficient synthetic routes for a variety of important compounds.
REVIEW QUESTIONS


1. Describe a strategy for isolating the gene for the restriction 
endonuclease EcoRI. 

2. Outline a strategy for cloning the gene for 2,5-diketo-D- 
gluconic acid reductase from Corynebacterium into Erwinia. 
Why is this useful? 

3. Suggest a strategy for improving the commercial utility of 
a cloned 2,5-diketo-D-gluconic acid reductase gene. 

4. How can indigo be produced in E. coli? 

5. Outline a strategy for increasing the production of the 
amino acid tryptophan by C. glutamicum. 


6. Suggest a strategy for isolating some of the genes that are 
involved in the biosynthesis of the antibiotic undecylprodigi- 
osin, which is normally synthesized by S. coelicolor. 

7. Why is it difficult to genetically transform various 
Streptomyces spp.? How can this difficulty be overcome? 

8. Suggest a simple strategy for increasing the yield of an 
antibiotic by the genetic manipulation of the Streptomyces 
strain that produces the antibiotic. 

9. Suggest an approach for producing modified versions of 
polyketide antibiotics, such as erythromycin. 
10. How can an adhesive protein biopolymer that is normally
produced by the blue mussel M. edulis be synthesized in E. 
coli? 

11. Suggest a scheme for producing poly(3-hydroxybutyric 
acid) in E. coli. 

12. What is whey? How can it be used to produce industrially 
important compounds? 

13. How can E. coli be engineered to overproduce cysteine? 

14. Suggest a scheme for the isolation of a lipase gene from 
the bacterium P. alcaligenes. How might this gene be used in a 
practical way? 

15. How can the very large DNA fragments encoding antibi¬ 
otic biosynthesis genes be introduced into host bacteria? 


16. What strategies can be employed to produce large amounts 
of either 7ACA or 7ADCA in bacteria? 

17. How can E. coli be engineered to produce lycopene? 

18. How can the level of succinic acid produced by the bacte¬ 
rium M. succiniproducens be increased? 

19. What is hyaluronic acid? How can it be produced in B. 
subtilis? 

20. How can E. coli be genetically engineered to overproduce 
valine? 

21. How can foreign proteins be expressed at high levels in 
Streptomyces spp.? 
Bioremediation and
Biomass Utilization 


For centuries, humans believed that atmospheric, terrestrial, and
aquatic systems were sufficient to absorb and break down wastes 
from population centers, industry, and farming. We now know that 
this is not true. Today, there are two fundamental problems. First, how do 
we dispose of the large quantities of wastes that are continually being pro¬ 
duced? Second, how do we remove the toxic compounds that have been 
accumulating at dump sites, in the soil, and in water systems over the last 
few decades? Governments have tried to meet the challenge of environ¬ 
mental contamination by instituting antipollution regulations, but these 
rules often remain unenforced. Governments have also encouraged the 
three R's: reduce, reuse, and recycle. 

Researchers are currently testing a number of technological strategies, 
including biotechnological schemes, to deal with large-scale wastes, such 
as lignocellulosics and toxic substances that persist in ecosystems. 

The term "bioremediation" has been introduced to describe the process 
of using biological agents to remove toxic wastes from the environment. 
"Biomass" is the term used to describe the materials that are produced by 
the food and agricultural industries (e.g., starch and lignocellulosics) that 
were discarded as waste in the past. Biomass is now being considered as a 
source material for the production of a variety of economically important 
products. 

Microbial Degradation of Xenobiotics 

The problem of toxic waste disposal is enormous. Worldwide production
in 1985 of just one chemical that is released into the environment,
pentachlorophenol, was more than 50,000 tons. Incineration and chemical
treatment have been used to break down many toxic chemicals, but these
methods are costly and often create new environmental difficulties. With
the discovery in the mid-1960s of a number of soil microorganisms that are
capable of degrading xenobiotic ("unnatural," or synthetic; from the Greek
xenos, meaning "foreign") chemicals, such as herbicides, pesticides,
refrigerants, solvents, and other organic compounds, the notion that microbial

TABLE 14.1 Some Pseudomonas plasmids, their degradative pathways, and sizes

Name of plasmid  Compound(s) degraded                          Plasmid size (kilobases)
SAL              Salicylate                                    60
SAL              Salicylate                                    68
SAL              Salicylate                                    72
SAL              Salicylate                                    83
TOL              Xylene and toluene                            113
CAM              Camphor                                       225
XYL              Xylene                                        15
NAH              Naphthalene                                   69
OCT              Octane, D-camphor                             ~500
NAH7             Naphthalene, salicylate                       83
pJP1             2,4-Dichlorophenoxyacetic acid                87
pJP2             2,4-Dichlorophenoxyacetic acid                54
pJP3             2,4-Dichlorophenoxyacetic acid                78
pP51             1,2-Di-, 1,4-di-, and 1,2,4-trichlorobenzene  110
pAC31            3,5-Dichlorobenzoate                          108
pAC25            3-Chlorobenzoate                              102
pWW0             Xylene and toluene                            117
pWW100           Biphenyl                                      200
pWW0             Xylene and toluene                            176
pXYL-K           Xylene and toluene                            135
pVI150           Phenol                                        >200
pNL1             Xylene, naphthalene, biphenyl                 184
pAC27            3-Chlorobenzoate                              110
pHMT112          Benzene                                       112
pTDN1            Aniline, m- and p-toluidine                   79

Plasmids with the same name encode similar degradative pathways, even though they have different 
sizes and were described in different laboratories. 


degradation might provide an economical and effective means of disposing 
of toxic chemical wastes gained credence. 

Members of the genus Pseudomonas are the predominant group of
soil microorganisms that degrade xenobiotic compounds. Biochemical
assays have shown that various Pseudomonas strains can break down and,
as a consequence, detoxify hundreds of different organic compounds. In
many cases, one strain can use any of several different related compounds
as its sole carbon source.

The biodegradation of complex organic molecules generally requires
the concerted efforts of several different enzymes. The genes that code for
the enzymes of these biodegradative pathways are sometimes located in
the chromosomal DNA, although they are more often found on large
(approximately 50- to 200-kilobase) plasmids (Table 14.1). In some
organisms, the genes that contribute to the degradative pathway are found in
both chromosomal and plasmid DNA.

Degradative bacteria, in most cases, enzymatically convert xenobiotic,
nonhalogenated aromatic compounds to either catechol (Fig. 14.1) or
protocatechuate (Fig. 14.2). Then, through a series of oxidative cleavage
reactions, catechol and protocatechuate are processed to yield either acetyl
coenzyme A (acetyl-CoA) and succinate (Fig. 14.3) or pyruvate and
acetaldehyde (Fig. 14.4), compounds that are readily metabolized by almost all

FIGURE 14.1 Pathways for the enzymatic conversion of aromatic compounds to
catechol by degradative bacteria.

FIGURE 14.2 Pathways for the enzymatic conversion of aromatic compounds to
protocatechuate by degradative bacteria.

organisms. Halogenated aromatic compounds, which are the main
components of most pesticides and herbicides, are converted to catechol,
protocatechuate, hydroquinones, or the corresponding halogenated derivatives

FIGURE 14.3 The ortho-cleavage pathway for enzymatic conversion of catechol and
protocatechuate to acetyl-CoA and succinate.


by the same enzymes that degrade the nonhalogenated compounds.
However, for the halogenated compounds, the rate of degradation is
inversely related to the number of halogen atoms that are initially present
on the target compound. Dehalogenation, the removal of a halogen
substituent from an organic compound, is the critical requirement for
detoxification and often occurs by a nonselective dioxygenase reaction that
replaces the halogen on a benzene ring with a hydroxyl group. This step
may occur either during or after the biodegradation of the original
halogenated compound.

FIGURE 14.4 The meta-cleavage pathway for the enzymatic conversion of catechol
and protocatechuate to pyruvate and acetaldehyde.


Genetic Engineering of Biodegradative Pathways 

Despite the ability of many naturally occurring microorganisms to degrade
a number of different xenobiotic chemicals, there are limitations to the
biological treatment of these waste materials. For example, (1) no single
microorganism can degrade all organic wastes; (2) high concentrations of some
organic compounds can inhibit the activity or growth of degradative
microorganisms; (3) most contaminated sites contain mixtures of chemicals,
and an organism that can degrade one or more of the components of
the mixture may be inhibited by other components; (4) many nonpolar
compounds adsorb onto particulate matter in soils or sediments and
become less available to degradative microorganisms; and (5) microbial
biodegradation of organic compounds is often quite slow. One way to
address some of these problems is to transfer by conjugation into a
recipient strain plasmids that carry genes for different degradative pathways
(Fig. 14.5). If two resident plasmids contain homologous regions of DNA,
recombination can occur and a single, larger "fusion" plasmid with
combined functions can be created. Alternatively, if two plasmids do not
contain homologous regions and, in addition, belong to different incompatibility
groups, they can coexist within a single bacterium.

Manipulation by Transfer of Plasmids 

Bacterial strains with expanded degradative capabilities were first created 
in the 1970s by Chakrabarty and coworkers. They used different plasmids 
to construct a bacterial strain that degraded a number of the hydrocarbon 
components found in petroleum (Fig. 14.5). This strain has been called a 
"superbug" because of its increased metabolic capabilities. The CAM 


FIGURE 14.5 Schematic representation of the development of a bacterial strain that
can degrade camphor, octane, xylene, and naphthalene. Strain A, which contains a
CAM (camphor-degrading) plasmid, is mated with strain B, which carries an OCT
(octane-degrading) plasmid. Following plasmid transfer and homologous
recombination between the two plasmids, strain E carries a CAM and OCT biodegradative
fusion plasmid. Strain C, which contains an XYL (xylene-degrading) plasmid, is
mated with strain D, which contains an NAH (naphthalene-degrading) plasmid, to
form strain F, which carries both of these plasmids. Finally, strains E and F are
mated to yield strain G, which carries the CAM/OCT fusion plasmid, the XYL
plasmid, and the NAH plasmid.


(camphor-degrading) plasmid was transferred by conjugation into a strain
carrying the OCT (octane-degrading) plasmid. These two plasmids were
incompatible and could not be maintained in the same cell as separate
plasmids. However, when recombination occurred between the two
plasmids, the resulting single plasmid was perpetuated and carried both
camphor- and octane-degradative activities. The NAH (naphthalene-degrading)
plasmid was transferred by conjugation into a strain carrying the XYL
(xylene-degrading) plasmid. The NAH and XYL plasmids were compatible
and could therefore coexist within the same host cell. Finally, the CAM/OCT
fusion plasmid was transferred by conjugation into the strain carrying
the NAH and XYL plasmids. The final result of these manipulations was
the generation of a strain that grew better on crude oil than did any of the
single-plasmid strains either alone or in combination.
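
The successive matings described above can be traced with a small bookkeeping model. The Python sketch below is illustrative only. The incompatibility-group labels ("inc-A", "inc-B", and "inc-C") are hypothetical placeholders rather than the plasmids' actual group assignments, and the model simply assumes that two incompatible plasmids are rescued as a single recombinant fusion plasmid, as was observed for CAM and OCT.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    inc_group: str          # hypothetical incompatibility-group label
    substrates: frozenset   # compounds whose degradation the plasmid encodes


def fuse(incoming, resident):
    """Model recombination of two plasmids into one fusion plasmid."""
    return Plasmid(
        name=f"{incoming.name}/{resident.name}",
        inc_group=incoming.inc_group,
        substrates=incoming.substrates | resident.substrates,
    )


def conjugate(recipient, incoming):
    """Return the recipient's plasmids after it receives `incoming`.

    Plasmids in the same (hypothetical) incompatibility group are assumed
    to persist only as a fusion plasmid; compatible plasmids coexist.
    """
    kept = []
    merged = incoming
    for resident in recipient:
        if resident.inc_group == merged.inc_group:
            merged = fuse(merged, resident)
        else:
            kept.append(resident)
    return kept + [merged]


CAM = Plasmid("CAM", "inc-A", frozenset({"camphor"}))
OCT = Plasmid("OCT", "inc-A", frozenset({"octane"}))
XYL = Plasmid("XYL", "inc-B", frozenset({"xylene"}))
NAH = Plasmid("NAH", "inc-C", frozenset({"naphthalene"}))

strain_e = conjugate([OCT], CAM)               # CAM into the OCT strain
strain_f = conjugate([XYL], NAH)               # NAH into the XYL strain
strain_g = conjugate(strain_f, strain_e[0])    # fusion plasmid into strain F

degradable = frozenset().union(*(p.substrates for p in strain_g))
print([p.name for p in strain_g], sorted(degradable))
```

In this model, strain G ends up with three plasmids (XYL, NAH, and the CAM/OCT fusion) that together cover all four substrates, which mirrors the outcome shown in Fig. 14.5.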

Although this particular multiple-degradative strain has not been used
to clean up oil spills, it has played a critical role in the development of the
biotechnology industry. The inventor of this "superbug" was granted a
U.S. patent describing its construction and use. This was the first patent
ever granted for a genetically engineered microorganism and represented
a watershed court decision, because it implied that biotechnology
companies could protect their inventions in the same way as the chemical and
pharmaceutical industries had in the past.

Most of the degradative bacteria that have been genetically
manipulated by plasmid transfer are mesophiles, organisms that grow well only at
temperatures between 20 and 40°C. However, rivers, lakes, and oceans that
are polluted generally have temperatures that range from 0 to 20°C. To test
whether bacteria with enhanced degradative abilities could be created for
cold environments, a TOL (toluene-degrading) plasmid from a mesophilic
Pseudomonas putida strain was transferred by conjugation into a facultative
psychrophile, an organism with a low temperature optimum. The host
psychrophile could degrade salicylate, but not toluene, and could use
salicylate as its sole carbon source at temperatures as low as 0°C. The
transformed strain carried the introduced TOL plasmid and its own SAL
(salicylate-degrading) plasmid and was able to use either salicylate or
toluene as its sole carbon source at 0°C (Table 14.2). The wild-type
(nontransformed)


TABLE 14.2 Generation times of wild-type (nontransformed) and
transformed psychrophilic strains of P. putida on salicylate or
toluate as the sole carbon source at various temperatures

                    Generation time (h) for:
Temperature (°C)    Wild-type +    Transformant +    Transformant +
                    salicylate     salicylate        toluate
37                  No growth      No growth         No growth
30                  2.2            2.5               2.0
25                  2.1            3.2               1.3
20                  2.6            3.8               1.9
15                  3.2            4.2               2.9
10                  6.3            5.6               3.3
5                   13.9           12.9              12.2
0                   28.6           18.1              24.4

Adapted from Kolenc et al., Appl. Environ. Microbiol. 54:638-641, 1988.

The wild-type strain is unable to utilize toluate for growth at any temperature because it
lacks the enzymes to metabolize the compound.
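
The generation times in Table 14.2 can be converted into more intuitive quantities. The short Python sketch below uses the standard relationship between generation time g and specific growth rate (mu = ln 2 / g) and assumes steady exponential growth, which is an idealization that the table itself does not state. The function names are illustrative.

```python
import math

# Generation times (h) transcribed from Table 14.2; None marks "no growth".
# Columns: wild-type + salicylate, transformant + salicylate,
#          transformant + toluate.
GENERATION_TIME_H = {
    37: (None, None, None),
    30: (2.2, 2.5, 2.0),
    25: (2.1, 3.2, 1.3),
    20: (2.6, 3.8, 1.9),
    15: (3.2, 4.2, 2.9),
    10: (6.3, 5.6, 3.3),
    5: (13.9, 12.9, 12.2),
    0: (28.6, 18.1, 24.4),
}


def specific_growth_rate(generation_time_h):
    """Return mu = ln(2) / g in 1/h, or None when there is no growth."""
    if generation_time_h is None:
        return None
    return math.log(2) / generation_time_h


def doublings_per_day(generation_time_h):
    """Return the number of doublings in 24 h (0.0 when there is no growth)."""
    if generation_time_h is None:
        return 0.0
    return 24.0 / generation_time_h


for temp_c, times in GENERATION_TIME_H.items():
    toluate = times[2]
    print(temp_c, toluate, round(doublings_per_day(toluate), 2))
```

For example, at 0°C the transformant growing on toluate (generation time, 24.4 h) completes slightly less than one doubling per day under this idealization.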
FIGURE 14.6 The meta-cleavage pathway and the xyl operon of the toluene- and
xylene-degrading plasmid pWW0. Transcription of the xyl operon is controlled by
the Pm promoter, which is regulated by the XylS gene product, which in turn must
be activated by one of the initial pathway substrates. The genes from xylX to xylH
(X to H) are under the control of the Pm promoter. The xylS gene, which is not part
of this operon, is constitutively expressed. Some of the primary substrates are
benzoate, where R and R' = H; 3-methylbenzoate, where R = H and R' = CH3;
3-ethylbenzoate, where R = H and R' = CH2CH3; and 4-methylbenzoate, where R = CH3
and R' = H. The xylXYZ genes encode toluene dioxygenase, xylL encodes
dihydroxycyclohexadiene carboxylate dehydrogenase, xylE encodes catechol
2,3-dioxygenase, xylF encodes hydroxymuconic semialdehyde hydrolase, xylG encodes
hydroxymuconic semialdehyde dehydrogenase, xylH encodes 4-oxalocrotonate
tautomerase, xylI encodes 4-oxalocrotonate decarboxylase, xylJ encodes
2-oxopent-4-enoate hydratase, and xylK encodes 2-oxo-4-hydroxypentonate aldolase.


psychrophilic strain was unable to grow at any temperature when toluene
(or toluate) was the only carbon source (not shown). This simple
experiment indicates the feasibility of engineering psychrophilic degradative
bacteria for use in the environment.

Manipulation by Gene Alteration 

4-Ethylbenzoate. Bringing together different intact plasmid-based
degradative pathways by conjugation is only one way to create bacteria with
novel properties. It may also be possible to extend the degradative
capability of a strain by altering the genes of an existing degradative pathway.
The feasibility of this approach was examined for the toluene- and
xylene-degrading pathway of plasmid pWW0. This plasmid encodes a
"meta-cleavage" pathway involving 12 different genes and enables pseudomonads
carrying the plasmid to utilize various alkylbenzoates as carbon sources
(Fig. 14.6). The genes in the toluene-xylene pathway of pWW0 are part of
a single operon, called the xyl operon, under the control of the Pm promoter.
Transcription from the Pm promoter, by RNA polymerase, is positively
regulated by the xylS gene product, which is activated by most of the initial

FIGURE 14.7 (A) Protocol used to create a XylS protein that is activated by
4-ethylbenzoate. The Pm promoter is cloned onto plasmid pBR322 and replaces the
tetracycline resistance (Tetr) gene promoter to form plasmid pJLR200; the
xylS gene and its promoter are spliced onto a broad-host-range plasmid
containing a kanamycin resistance (Kanr) gene. E. coli is transformed with both of
these plasmids. Transformants are selected by their resistance to both
ampicillin (Ampr) and kanamycin and then chemically mutagenized with ethyl
methanesulfonate. Only cells with a mutation (S*) in the xylS gene that
enables the XylS protein to be activated by 4-ethylbenzoate (EB) can grow on
medium that contains both 4-ethylbenzoate and tetracycline, because only
these cells are resistant to tetracycline.




substrates, such as benzoate and 3-methylbenzoate, of the pathway (Fig.
14.6). Detailed biochemical and genetic analyses showed that bacteria
carrying pWW0 could degrade 4-ethylbenzoate, albeit slowly, to
4-ethylcatechol, which accumulated in the medium, but no further. 4-Ethylcatechol

FIGURE 14.7 (continued) (B) Protocol used to create a modified catechol
2,3-dioxygenase that is not inhibited by 4-ethylcatechol. A P. putida strain
carrying pWW0 is transformed with a broad-host-range plasmid carrying the
mutated xylS* gene, whose product can activate the Pm promoter. Transformants
are chemically mutagenized and then grown on minimal medium that contains
4-ethylbenzoate as the sole carbon source and kanamycin. Any cell that can grow
on this medium carries a mutated catechol 2,3-dioxygenase gene. This mutation
is indicated by a dot in the middle of the xyl gene cluster.


prevented its own degradation by inactivating one of the most important
enzymes in the biodegradative pathway, catechol 2,3-dioxygenase, the
product of the xylE gene. In addition, 4-ethylbenzoate, unlike most other
alkylbenzoates, does not activate the XylS protein; consequently,
transcription of the operon from the Pm promoter did not occur to any significant
extent when 4-ethylbenzoate was the only substrate. Thus, there are two
major problems with the naturally occurring meta-cleavage pathway
system: (1) how to overcome the inactivation of an important enzyme in the
degradative process by 4-ethylcatechol, the intermediate formed from
4-ethylbenzoate, and (2) how to induce transcription of the genes of this
pathway with 4-ethylbenzoate as the inducer.

To find a mutant that could solve the second problem, a tetracycline
resistance gene was placed under the control of the Pm promoter on one
plasmid, which also carried an ampicillin resistance gene. The xylS gene
was cloned onto another plasmid carrying a kanamycin resistance gene.
Transformants carrying both of these plasmids were selected on the basis
of resistance to both ampicillin and kanamycin (Fig. 14.7A). The Escherichia
coli cells carrying both of these plasmids were treated with the mutagen
ethyl methanesulfonate and plated onto a medium containing both
tetracycline and 4-ethylbenzoate. The only cells that could grow on this medium
carried an altered XylS protein (S* in Fig. 14.7A) that could interact with
4-ethylbenzoate and cause the tetracycline resistance gene to be
transcribed. Thus, the degradative pathway that includes this mutated xylS
gene can be induced by 4-ethylbenzoate. To address the catechol
2,3-dioxygenase inactivation problem, the mutated xylS gene was subcloned onto a
broad-host-range plasmid carrying a kanamycin resistance gene and
introduced into P. putida cells carrying pWW0 (Fig. 14.7B). The transformed
cells were plated, at a high cell density, onto a minimal medium containing
4-ethylbenzoate as the sole carbon source, kanamycin to select for the
presence of the plasmid, and ethyl methanesulfonate. Cells that were able to
grow on this medium produced an altered form of the enzyme catechol
2,3-dioxygenase that was not inhibited by 4-ethylcatechol. Additional
analysis confirmed that the catechol 2,3-dioxygenase gene on pWW0 had
been mutated and that mutant versions of both the xylS gene and the
catechol 2,3-dioxygenase gene were required for the degradation of
4-ethylbenzoate.
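
The logic of the first selection step (Fig. 14.7A) can be written as a simple truth table. The Python sketch below is a deliberately simplified model: it treats XylS activation as all-or-none, and it assumes that the mutant S* protein still responds to the normal effectors, which the text does not state. The allele labels and function names are illustrative.

```python
# Effectors that activate each XylS variant. The wild-type entries follow
# the text; keeping benzoate and 3-methylbenzoate for S* is an assumption.
XYLS_ACTIVATORS = {
    "wild-type": {"benzoate", "3-methylbenzoate"},
    "S*": {"benzoate", "3-methylbenzoate", "4-ethylbenzoate"},
}


def pm_transcription_on(xyls_allele, effector):
    """Pm is transcribed only when XylS is activated by the effector."""
    return effector in XYLS_ACTIVATORS[xyls_allele]


def grows_on_plate(xyls_allele, effector, tetracycline=True):
    """A cell survives tetracycline only if Pm drives the Tet resistance gene."""
    tet_resistant = pm_transcription_on(xyls_allele, effector)
    return tet_resistant or not tetracycline


for allele in ("wild-type", "S*"):
    print(allele, grows_on_plate(allele, "4-ethylbenzoate"))
```

Under these assumptions, only cells carrying the S* allele form colonies on medium containing both tetracycline and 4-ethylbenzoate, which is the basis of the selection.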

An important aspect of this work is the fact that the two genes that
were altered, i.e., those encoding XylS and catechol 2,3-dioxygenase, are
the major determinants of the range of compounds that can be degraded by
this pathway. The work with 4-ethylbenzoate demonstrates that by
combining recombinant DNA technology, conventional mutagenesis, and the
appropriate selection protocols, novel properties can be added to a
degradative pathway.

Trichloroethylene. The compound trichloroethylene is widely used as a
solvent and a degreasing agent, and as a result, it is one of the most
common contaminants of soil and groundwater. Trichloroethylene persists
in the environment for years, is a likely carcinogen, and is regulated in the
United States under the Safe Drinking Water Act to a maximum
contaminant level of 5 parts per billion. Unfortunately, anaerobic soil bacteria can
reductively dehalogenate it to produce vinyl chloride, which is an even
more toxic compound.

Studies showed that some of the strains of P. putida that could degrade
aromatic compounds such as toluene could also degrade trichloroethylene.
Genetic studies established that the complete meta-cleavage degradative
pathway was not necessary to completely detoxify trichloroethylene. In
fact, only the enzyme toluene dioxygenase, which normally catalyzes the
oxidation of toluene to cis-toluene dihydrodiol, was required.

Four genes (Fig. 14.8A) are involved in the production of a functional
toluene dioxygenase. These genes were isolated and expressed in E. coli
under the control of the strong and inducible tac promoter. When E. coli
cells carrying these genes were induced by the addition of
isopropyl-β-D-thiogalactopyranoside (IPTG), an inducer of the tac promoter,
trichloroethylene was efficiently broken down to harmless compounds by the concerted
enzymatic activities of the Tod proteins encoded by these four genes (Fig.
14.8B). Although the initial rates of trichloroethylene degradation were
lower with E. coli than with the original P. putida strain, the E. coli cells
maintained these rates for longer periods than P. putida. It has been
speculated that the basis for this difference may be that E. coli membranes are not
as susceptible to damage from trichloroethylene as P. putida membranes
are.

In a variation of this experiment, a hybrid Pseudomonas strain with
elements of two separate degradative pathways was constructed. Bacterial
strains that can degrade the compound biphenyl use the enzyme biphenyl
dioxygenase. Biphenyl dioxygenase is a multicomponent enzyme encoded
by four genes, bphA1A2A3A4, where bphA1 encodes a large subunit of
terminal dioxygenase (an iron-sulfur protein), bphA2 encodes a small subunit
of terminal dioxygenase, bphA3 encodes ferredoxin, and bphA4 encodes
ferredoxin reductase (Fig. 14.9). BphA1 and BphA2 are associated as a
heterotetramer and catalyze the introduction of two oxygen atoms into the
biphenyl ring. Ferredoxin and ferredoxin reductase act as an electron
transfer system from reduced nicotinamide adenine dinucleotide (NADH)
to reduce the terminal dioxygenase. Biphenyl dioxygenase is quite similar
in both structure and function to the enzyme toluene dioxygenase. Despite
the similarities of their enzymes, biphenyl-utilizing pseudomonads cannot
grow on toluene, and toluene-utilizing strains cannot grow on biphenyl.
However, when the bphA1 gene (coding for the large subunit of biphenyl
dioxygenase) from P. putida KF715 was replaced by homologous
recombination with the todC1 gene (coding for the large subunit of the toluene
dioxygenase) from P. putida F1, the resultant strain (Fig. 14.9) was able to
degrade trichloroethylene (Table 14.3). In fact, the engineered strain grew
well on a range of aromatic compounds and also was very efficient at

FIGURE 14.8 A cloned toluene dioxygenase operon under the control of the tac
promoter in E. coli. (A) Toluene dioxygenase activity is due to the products of four
genes (todA, todB, todC1, and todC2). todA encodes a flavoprotein that accepts
electrons from NADH and transfers them to a ferredoxin encoded by todB, which
reduces the terminal dioxygenase that is encoded by todC1 and todC2. These genes
are equivalent to the genes xylXYZ shown in Fig. 14.7. (B) Toluene is converted to
cis-toluene dihydrodiol by the concerted enzymatic activities of the Tod proteins.

FIGURE 14.9 Creation of a hybrid Pseudomonas strain with elements of two separate
degradative pathways by replacement, through homologous recombination, of the
biphenyl dioxygenase bphA1 gene with a toluene dioxygenase todC1 gene.


degrading trichloroethylene, demonstrating that it is possible to rationally
engineer bacterial strains that can degrade a number of different
compounds. Degradative bacteria with novel biological activities
were also created by a slightly different approach, the construction of chimeric
versions of the bphA1 gene (Fig. 14.10). In this case, one bphA1 gene was
from a strain of Pseudomonas pseudoalcaligenes with the ability to degrade
only a narrow range of polychlorinated biphenyls (PCBs), while the other
was from Burkholderia cepacia, which can degrade a very wide range of
PCBs. Some of the hybrid genes encoded an enzyme with a wider
degradative ability than either of the original enzymes. The native bphA1 gene in P.
pseudoalcaligenes was replaced with hybrid genes by homologous
recombination. It now remains to be seen whether these engineered bacterial
strains degrade a range of PCBs on a large scale.
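
The way in which chimeric genes are assembled from two parental genes can be illustrated with a short sketch. The Python example below is hypothetical in every detail: the "genes" are placeholder strings rather than real bphA1 sequences, and the cut positions stand in for restriction sites that are assumed to lie at the same positions in both parents. It enumerates every combination of parental fragments other than the two parental genes themselves.

```python
from itertools import product


def split_at_sites(gene, cut_positions):
    """Cut a sequence at the given 0-based positions and return the fragments."""
    bounds = [0, *sorted(cut_positions), len(gene)]
    return [gene[start:end] for start, end in zip(bounds, bounds[1:])]


def chimeras(parent_a, parent_b, cut_positions):
    """Return {pattern: sequence} for all non-parental fragment combinations.

    A pattern such as "ABA" records which parent supplied each fragment.
    """
    frags_a = split_at_sites(parent_a, cut_positions)
    frags_b = split_at_sites(parent_b, cut_positions)
    result = {}
    for pattern in product("AB", repeat=len(frags_a)):
        if len(set(pattern)) == 1:
            continue  # skip the two unmodified parental genes
        pieces = [
            frags_a[i] if source == "A" else frags_b[i]
            for i, source in enumerate(pattern)
        ]
        result["".join(pattern)] = "".join(pieces)
    return result


# Placeholder parents: lowercase "a" and "b" mark the origin of each position.
parent_a = "a" * 12
parent_b = "b" * 12
hybrids = chimeras(parent_a, parent_b, cut_positions=[4, 8])
for pattern, sequence in sorted(hybrids.items()):
    print(pattern, sequence)
```

With two shared cut sites there are three fragments per gene and therefore six possible chimeras, each of which would then have to be assayed for degradative activity.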

Cell surface-expressed enzymes. Currently, detoxification of
organophosphate pesticides in the environment is performed by chemical treatment,
incineration, or burial in landfill sites. Each of these approaches has serious
environmental drawbacks. It would therefore be advantageous if bacteria
that are able to degrade these compounds could be utilized in place of the
methods that are currently used. Several soil bacteria, including Pseudomonas
diminuta MG and Flavobacterium spp., possess an enzyme,
organophosphorus hydrolase, that catalyzes the hydrolysis of many of these pesticides,
including methyl and ethyl parathion, paraoxon, chlorpyrifos (Dursban),
coumaphos, cyanophos, and diazinon, to environmentally innocuous
compounds. Unfortunately, these bacteria, as well as E. coli engineered to
express organophosphorus hydrolase, degrade these pesticides relatively
slowly because of the low rate of uptake into the bacterial cells. Thus, a
novel approach was developed to solve this problem. E. coli cells were
engineered to express organophosphorus hydrolase as part of a fusion
protein that contained the E. coli lipoprotein signal peptide and the N-terminal
portion of the lipoprotein and outer membrane protein A (Fig. 14.11). When
this fusion protein was synthesized, it was localized on the outer surface of
the bacterium (Fig. 14.12). This eliminated the problem of a low rate of
pesticide uptake into the engineered bacterium. Cells with
organophosphorus hydrolase on their outer surfaces had approximately seven times
higher activity than cells that expressed a similar amount of the enzyme
intracellularly. Moreover, the enzyme activity of cells with
organophosphorus hydrolase on their surfaces was extremely stable. Nearly 100% of
the activity was retained after 1 month at 37°C. This concept is quite
promising; however, to date, it has been tested only on a laboratory scale.

TABLE 14.3 Growth of parental and engineered Pseudomonas strains on various aromatic compounds

                      Growth on:
Strain                Biphenyl   Diphenylmethane   Toluene   Benzene   Trichloroethylene
P. putida KF715       +++        +++               -         -         -
P. putida F1          -          -                 +++       +++       +
P. putida KF715-D5    ++         +                 +++       +++       +++

Adapted from Suyama et al., J. Bacteriol. 178:4039-4046, 1996.

In P. putida KF715-D5, the bphA1 gene from P. putida KF715 is replaced with the todC1 gene from P. putida F1. +++, good growth; ++, moderate growth; +, poor
growth; -, very poor or no growth.

FIGURE 14.10 Chimeric (hybrid) bphA1 genes constructed from P. pseudoalcaligenes
and B. cepacia bphA1 genes. The chimeras were constructed by digesting both of the
isolated genes with the restriction enzymes shown and then recombining the
resultant fragments as indicated. The various transformants carrying these constructs
were subsequently assayed for enzymatic (degradative) activity.

Radioactive environments. The 26 countries worldwide that generate
electricity from nuclear power plants, as well as those countries that have
nuclear weapon programs, have generated thousands of radioactive waste
sites. In the United States alone, there are more than 3,000 of these sites, and
it has been estimated that, using currently available technology, the cleanup
will require around $200 billion and take approximately 70 years. In
addition to radioactivity, these sites also often have both organic and metal
pollutants. While biodegradation of the organic pollutants is a logical first

FIGURE 14.11 The DNA construct used to produce the fusion protein Lpp-OmpA-OPH.
This construct includes the E. coli lac promoter, DNA encoding the E. coli lpp signal
sequence and the first 9 amino acids of the mature E. coli lipoprotein, the portion of
the E. coli ompA gene encoding the transmembrane domain, and the gene (opd) for
the Flavobacterium sp. organophosphorus hydrolase (OPH) and its signal peptide.


step in their remediation, most microorganisms are highly sensitive to the
damaging effects of the radiation (Fig. 14.13). Fortunately, the
nonpathogenic soil bacterium Deinococcus radiodurans is naturally resistant to quite
high levels of ionizing radiation. This resistance has been attributed to
DNA repair processes that are exceptionally effective at repairing DNA
damage. Moreover, any DNA that is introduced into D. radiodurans, either
as part of a plasmid or inserted into the chromosome, is also protected
against high levels of potentially damaging radiation. Since this bacterium
can express foreign genes while growing in the presence of continuous
radiation, it would appear to be an ideal candidate for the expression of
bioremediating proteins in toxic environments that contain radioactive
contaminants.

As a first step toward developing a system to remediate organic
pollutants that are present in radioactive environments, the four genes that
together code for toluene dioxygenase (Fig. 14.8) were placed on a plasmid
under the control of a constitutive D. radiodurans promoter. The entire
plasmid was then inserted into the chromosome of D. radiodurans by
homologous recombination (a single crossover) between the chromosomal
DNA and a chromosomal DNA fragment on the plasmid adjacent to the
toluene dioxygenase genes. The integrated toluene dioxygenase was active


FIGURE 14.12 Schematic representation of the fusion protein Lpp-OmpA-OPH
anchored in the E. coli outer membrane with organophosphorus hydrolase on the
outside of the cell and therefore exposed to the external medium. Lpp (shown in
red) includes the first 9 amino acids from the E. coli lipoprotein. OmpA (shown in
black) includes the transmembrane domain (amino acids 46 to 159) of E. coli outer
membrane protein A. OPH (shown in blue) includes Flavobacterium sp.
organophosphorus hydrolase and its signal peptide.

FIGURE 14.13 Effect of γ-irradiation on the growth of E. coli and D. radiodurans.


and conferred upon D. radiodurans the ability to degrade toluene,
chlorobenzene, and 3,4-dichloro-1-butene irrespective of the presence or
absence of high levels of ionizing radiation. The successful expression of
toluene dioxygenase, an enzyme with several protein components and
including metal and organic cofactors, suggests that many less complex
biodegradative enzyme systems could also be expressed in D. radiodurans.
Since D. radiodurans is tolerant of high levels of toluene, and a number of
other organic compounds, once this bacterium has been genetically
engineered to express the appropriate biodegradation pathway, it should be
able to degrade a variety of organic pollutants in a radioactive
environment. However, it remains to be seen how these engineered organisms
behave under field conditions.

Nitroaromatics. For many years, a large number of different nitroaromatic 
compounds have been used industrially as dyes, plasticizers, explosives, 
solvents, and pesticides. Many of these compounds are recalcitrant to 
breakdown, persist in the environment, and are now considered to be toxic 
and sometimes carcinogenic pollutants. For example, the compound 
4-nitrophenol (Fig. 14.14) is formed by the hydrolysis of the insecticide 
parathion and is considered to be a priority environmental pollutant, in 
part because it leads to numerous human health problems. Similarly, the 
compound 3-methyl-4-nitrophenol (Fig. 14.14) is a toxic breakdown 
product of the agricultural insecticide fenitrothion. 

The bacterium Burkholderia sp. strain DNT facilitates the breakdown
and detoxification of 2,4-dinitrotoluene. First, the enzyme
2,4-dinitrotoluene dioxygenase removes one nitro group to form
4-methyl-5-nitrocatechol (Fig. 14.14), which is then converted to 2-hydroxy-5-methylquinone by
the enzyme 4-methyl-5-nitrocatechol monooxygenase. Unfortunately, the
4-methyl-5-nitrocatechol monooxygenase has a very narrow substrate
range, which includes 4-nitrocatechol (Fig. 14.14), as well as
4-methyl-5-nitrocatechol. However, this enzyme cannot efficiently use either of the
seemingly similar substrates 4-nitrophenol or 3-methyl-4-nitrophenol. In
an effort to expand the substrate range of the enzyme, the Burkholderia sp.

FIGURE 14.14 Breakdown of nitroaromatic compounds by a strain of Burkholderia sp.
that produces either the native form or a modified form of the enzyme
4-methyl-5-nitrocatechol monooxygenase.


strain DNT 4-methyl-5-nitrocatechol monooxygenase gene was isolated 
and subjected to error-prone PCR, and the randomly mutated genes were 
cloned into a plasmid vector. The library of mutated 4-methyl-5-nitrocate- 
chol monooxygenase genes was transferred into E. coli cells by electropora¬ 
tion, and the transformants were screened for activity on agar plates 
containing 4-nitrophenol. Cells that contained the wild-type gene turned 
the colonies light brown, while one transformant (of 3,000 tested) turned 
the colony dark brown, indicating that the 4-nitrophenol was being broken 
down. When the DNA sequence of the 4-methyl-5-nitrocatechol monooxy¬ 
genase gene from the transformant that turned the medium dark brown 
was determined, two amino acids within the encoded protein had been 
altered. Amino acid 22 was changed from methionine to leucine, and 
amino acid 380 was changed from leucine to isoleucine. These two amino 
acid alterations resulted in the altered enzyme having 10 times greater 
activity toward 4-nitrophenol and 4 times greater activity toward 3-methyl- 
4-nitrophenol than the native form of the enzyme. In addition, the modi¬ 
fied enzyme had about 50% more activity than the native enzyme toward 
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4-nitrocatechol and 4-methyl-5-nitrocatechol. The changes in activity that 
were found following the mutagenesis of the 4-methyl-5-nitrocatechol 
monooxygenase gene are an important first step in developing a bacterium 
that can effectively degrade 4-nitrophenol and 3-methyl-4-nitrophenol. 
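
The mutagenesis step that feeds such a screen can be pictured with a toy simulation. The sketch below is only an illustration under stated assumptions: the template sequence, the per-base error rate, and all function names are hypothetical, and real error-prone PCR has a biased mutation spectrum that this uniform substitution model ignores.

```python
import random

BASES = "ACGT"

def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Substitute with one of the three other bases, chosen uniformly.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)

def make_library(template, error_rate, size, seed=0):
    """Return `size` independently mutated copies of the template."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]

def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

if __name__ == "__main__":
    template = "ATGGCTAAAGGTCTGCTGGAAATCCGT"  # hypothetical gene fragment
    library = make_library(template, error_rate=0.05, size=3000)
    mutated = sum(count_differences(template, v) > 0 for v in library)
    print(f"{mutated} of {len(library)} variants carry at least one substitution")
```

Only a small fraction of such variants would be expected to improve activity on a new substrate, which is why a colony color screen that can examine thousands of transformants is needed.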


Utilization of Starch and Sugars 

Starch, the major food reserve polysaccharide in plants, consists of a mixture
of linear homopolymers (amylose) and branched homopolymers (amylopectin)
of D-glucose. Amylose is made up of linear chains of 1 × 10² to 4 × 10⁵
D-glucose residues linked by α-1,4 bonds (Fig. 14.15A). Amylopectin
consists of short linear chains of approximately 17 to 23 glucose units that
are linked by α-1,4 bonds and joined by α-1,6 linkages and some α-1,3
linkages to form a highly branched structure that contains 1 × 10⁴ to
4 × 10⁷ glucose residues (Fig. 14.15B). The degree of branching and the
ratio of amylose to amylopectin vary with the source and age of the starch.
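
To give a feel for the sizes involved, the residue counts above can be converted to approximate molar masses. This is a rough sketch, assuming each glucose residue in a chain contributes the mass of anhydroglucose (C6H10O5, about 162.14 g/mol) plus one water for the whole molecule; the function name is illustrative only.

```python
ANHYDROGLUCOSE = 162.14  # g/mol, C6H10O5: one glucose residue within a chain
WATER = 18.02            # g/mol, added once per molecule for the free chain ends

def polymer_mass(n_residues):
    """Approximate molar mass (g/mol) of a glucan with n_residues glucose units."""
    return n_residues * ANHYDROGLUCOSE + WATER

if __name__ == "__main__":
    # Residue ranges quoted in the text for the two starch components.
    for name, low, high in [("amylose", 1e2, 4e5), ("amylopectin", 1e4, 4e7)]:
        print(f"{name}: {polymer_mass(low):.3g} to {polymer_mass(high):.3g} g/mol")
```

The same formula applies to branched molecules, because a polymer of n residues contains n - 1 glycosidic bonds whether or not it is branched.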


FIGURE 14.15 (A) Pathway for the enzymatic hydrolysis of amylose; (B) pathway for
the enzymatic hydrolysis of amylopectin. The blue circles represent D-glucose
residues.

FIGURE 14.16 Industrial production of fructose and alcohol from starch.


Commercial Production of Fructose and Alcohol 

The major use of starch is in the food and brewing industries, where it is
usually hydrolyzed to low-molecular-weight components before being
converted into other compounds, especially fructose and alcohol. The most
important enzymes for breaking down and transforming starch are
α-amylase, glucoamylase, and glucose isomerase. Together, these three
enzymes account for approximately 30% of the cost of all enzymes
currently used for industrial processes.

Fructose and alcohol are produced commercially from starch by multistep
processes that include both enzymatic and nonenzymatic reactions, as
follows (Fig. 14.16).

1. The procedure begins with the gelatinization of milled grain (often
corn [maize], which is approximately 40% starch). This treatment
(steam cooking under pressure) exposes the surface area of the
starch, thereby making it more readily available for subsequent
enzymatic hydrolysis. The product of this process has a gel-like
consistency.

2. The gelatinized starch is cooled to 50 to 60°C, and α-amylase is
added. In this liquefaction step, the gel-like starch is enzymatically
digested by the hydrolysis of the available α-1,4-linkages to form
low-molecular-weight polysaccharides. A high temperature is
used because enzyme hydrolysis is more rapid at high temperatures
and the enzyme does not efficiently penetrate the gelatinized
starch at lower temperatures.

3. The final step for the release of glucose includes the addition of
glucoamylase and results in the saccharification (complete
hydrolysis) of the remaining polysaccharides, including both linear and
cross-linked molecules.


The end product of these treatments is glucose, which may be either
converted into alcohol as a result of fermentation by yeast cells or
transformed into fructose by the enzyme glucose isomerase. The success of the
latter enzymatic conversion has led to the replacement of sucrose by
fructose, which is much cheaper than sucrose, as a sweetener for prepared
foods and beverages in North America. The source of starch for commercial
fructose production is generally corn, and the final product is called
high-fructose corn syrup or high-fructose syrup, although it contains a
mixture of fructose and glucose. High-fructose corn syrup is typically either
90% fructose (for baked goods), 55% fructose (in soft drinks), or 42%
fructose (in sports drinks), with the remainder being glucose.

The enzyme α-amylase randomly hydrolyzes α-1,4-linkages in both
amylose and amylopectin chains, yielding a mixture of glucose, maltose
(two glucose molecules joined by an α-1,4 linkage), maltotriose (three
glucose molecules joined by α-1,4 linkages), and a series of α-limit
dextrins, which are the portions of the amylopectin chains that contain
cross-links (Fig. 14.15). Although α-amylase can be isolated from a variety
of microorganisms, it is commonly obtained from Bacillus amyloliquefaciens
for industrial purposes.

For some applications, the enzyme β-amylase is used in addition to or
in place of α-amylase to digest starch. By hydrolyzing alternate α-1,4
linkages from the ends of amylose and amylopectin chains, β-amylase cleavage
yields primarily maltose residues and various β-limit dextrins.
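
The contrast between the two cleavage patterns can be sketched with a toy model of a linear (amylose-like) chain. This is a simplified illustration, not a kinetic model: the function names are hypothetical, branch points are ignored, and the single leftover residue returned for odd-length chains is a simplification of what happens at the end of a real chain.

```python
import random
from collections import Counter

def alpha_amylase_digest(n_residues, n_cuts, rng):
    """Cleave n_cuts randomly chosen internal alpha-1,4 bonds of a linear chain.

    Returns the fragment lengths (in glucose residues) from one end to the other.
    """
    # Bond i joins residue i-1 to residue i, so a chain has n_residues - 1 bonds.
    cut_bonds = sorted(rng.sample(range(1, n_residues), n_cuts))
    edges = [0] + cut_bonds + [n_residues]
    return [right - left for left, right in zip(edges, edges[1:])]

def beta_amylase_digest(n_residues):
    """Cleave every second bond from one chain end, releasing maltose units."""
    maltose_units, leftover = divmod(n_residues, 2)
    return [2] * maltose_units + ([leftover] if leftover else [])

if __name__ == "__main__":
    rng = random.Random(1)
    fragments = alpha_amylase_digest(n_residues=200, n_cuts=80, rng=rng)
    # Random internal cuts give a mixture of sizes (glucose, maltose, maltotriose, ...).
    print("alpha-amylase fragment sizes:", sorted(Counter(fragments).items()))
    # Alternate-bond cleavage from the end gives almost exclusively maltose.
    print("beta-amylase fragment sizes:", sorted(Counter(beta_amylase_digest(200)).items()))
```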
MILESTONE

Microorganisms Having Multiple Compatible Degradative Energy-Generating
Plasmids and Preparation Thereof

A. M. Chakrabarty
U.S. patent 4,259,444, 1981

Before the advent of recombinant DNA technology, one of the ways that DNA
could be moved from one microorganism to another was by conjugation, in
which entire plasmids were transferred between organisms. Chakrabarty
transferred degradative plasmids (plasmids encoding all of the enzymes
involved in the biodegradation of a particular compound) from one bacterium
to another until he had constructed bacterial strains that contained several
degradative plasmids. Each degradative pathway, encoded by the genes on a
plasmid, degraded a different organic molecule. Starting with four separate
bacteria, Chakrabarty and his coworkers constructed a single bacterium that
contained the pathways to degrade camphor, octane, salicylate, and
naphthalene. While this work was scientifically interesting and innovative
in and of itself (especially since it was conducted in the early 1970s,
before the development of most of the techniques of recombinant DNA that
are now taken for granted), the key issue for the biotechnology industry
was that this invention was awarded a U.S. patent in March 1981, nearly 9
years after the application was first filed. In a landmark decision in
Chakrabarty's favor, the U.S. Supreme Court ruled that genetically
engineered microorganisms were inventions that could be patented in the
same way as any other invention. As much as any landmark scientific
experiment, this court decision and the subsequent award of the patent have
become a cornerstone of the development of the biotechnology industry.


The enzyme glucoamylase hydrolyzes α-1,3, α-1,4, and α-1,6 linkages;
however, because it is less efficient than α-amylase in cleaving α-1,4
linkages, it is usually used in conjunction with α-amylase. The major role
of glucoamylase is digestion of the cross-links of amylopectin, which results
in its complete breakdown to glucose. Glucoamylase and other enzymes
are used to reduce the carbohydrate (limit dextrin) content of normal beers
to produce the so-called light and dry varieties. Although glucoamylase
digestion is usually performed prior to the onset of the fermentation, the
two steps may be combined. A number of organisms produce glucoamylase,
but for industrial purposes, it is usually obtained from the fungus
Aspergillus niger.

Altering Alcohol Production 

The enzymes that are used in the production of alcohol or fructose from 
milled grain are major components of the overall cost of the process. These 
enzymes are often used only once and then discarded. Thus, innovative 
approaches to the inexpensive large-scale production of the enzymes could 
lower the cost of alcohol or fructose production. There are several ways to 
achieve this end. 

• Each of the enzymes could be overproduced in a fast-growing
recombinant microorganism that utilizes an inexpensive substrate,
thereby lowering the cost compared with production from native
organisms.

• Variants of α-amylase, either naturally occurring or genetically
manipulated, that function efficiently at 80 to 90°C could be used to
allow the liquefaction step to be performed at this temperature.
Heat-resistant α-amylase would speed the hydrolysis of gelatinized
starch while decreasing the amount of energy that is required to
cool the gelatinized starch to a temperature suitable for starch
hydrolysis.

• The α-amylase and glucoamylase genes could be altered so that each
enzyme would have the same temperature and pH optimum,
thereby enabling the liquefaction and saccharification steps to be
performed under the same conditions.

• An enzyme that could efficiently degrade raw starch could be found
or engineered, obviating the need for the gelatinization step and
thereby saving a large amount of energy.

• A fermentation organism that can synthesize and secrete glucoamylase
could be developed, eliminating the need to purify and add this
enzyme during fermentation.

A considerable amount of research has been initiated to determine whether
these possibilities are feasible.

FIGURE 14.17 Detection of a B. subtilis clone expressing an α-amylase gene.
The halo results from the degradation of the starch in the medium in the
vicinity of the clone by secreted α-amylase.

Genes that code for α-amylase have been isolated from a number of
organisms, including B. amyloliquefaciens and the high-temperature-tolerant
bacterium Bacillus stearothermophilus. Briefly, chromosomal DNA was
isolated, partially digested with the restriction enzyme Sau3AI, and then
ligated to BamHI-digested pUB110 DNA. This plasmid has a unique
BamHI site and carries a kanamycin resistance gene. The clone bank was
transformed into Bacillus subtilis, which does not have α-amylase activity,
and transformants were selected for resistance to kanamycin. All
transformants were tested for the production and secretion of α-amylase as
follows. After the transformants had formed colonies at 65°C on solid medium
containing starch, the plates were exposed to iodine vapor. The colonies
producing α-amylase were surrounded by a distinctive halo, or clear zone,
indicating that the starch in the immediate vicinity of these cells had been
hydrolyzed (Fig. 14.17). A positive starch-iodine test signifies that the
transformed cell contains an α-amylase gene that is transcribed from its
own promoter, because the vector does not carry a promoter. Also, a
secretion signal is present, because the substrate is too large to enter the
cell, and therefore the halo must be due to the activity of a secreted
α-amylase. The availability of α-amylase genes from varied sources will
enable researchers to carry out specific genetic modifications that suit the
needs of specific industrial processes.
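
The cloning step above relies on Sau3AI and BamHI producing compatible ends: Sau3AI cuts before GATC and BamHI cuts within GGATCC after the first G, so both leave a 5' GATC overhang. The sketch below illustrates this; the demonstration sequence is hypothetical, the helper names are illustrative, and only the top strand is scanned.

```python
import re

# Recognition sequence and top-strand cut offset for each enzyme.
ENZYMES = {
    "Sau3AI": ("GATC", 0),    # ^GATC
    "BamHI": ("GGATCC", 1),   # G^GATCC
}

def cut_positions(seq, enzyme):
    """Top-strand cut positions (0-based index of the base after the cut)."""
    site, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping occurrences would also be reported.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]

def overhang(enzyme):
    """Single-stranded 5' overhang left by a palindromic site cut symmetrically."""
    site, offset = ENZYMES[enzyme]
    return site[offset:len(site) - offset]

if __name__ == "__main__":
    demo = "AAGGATCCTTGATCAA"  # hypothetical sequence with one BamHI site
    for name in ENZYMES:
        print(name, "cuts at", cut_positions(demo, name), "overhang", overhang(name))
    # Every BamHI site contains a Sau3AI site, and the overhangs match,
    # so Sau3AI fragments can be ligated into a BamHI-cut vector.
    print("compatible ends:", overhang("Sau3AI") == overhang("BamHI"))
```

Because Sau3AI recognizes only four base pairs, a partial digest yields a population of overlapping fragments, which is what makes it useful for building a clone bank.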

With the aim of bypassing the saccharification step during the
production of alcohol from starch, researchers isolated a full-length
glucoamylase complementary DNA (cDNA) from the fungus Aspergillus awamori
and cloned it into a Saccharomyces cerevisiae plasmid under the control of
the promoter and transcription terminator regulatory signals from the yeast
enolase (ENO1) gene. A laboratory strain of S. cerevisiae that was
transformed with the plasmid carrying the glucoamylase cDNA was able to
express this activity and to ferment soluble starch to alcohol, thus
demonstrating that the approach is feasible.

Unfortunately, this laboratory strain of S. cerevisiae has a number of
properties that make it ill suited for use in a commercial process, including
an inability to tolerate high levels of alcohol, inefficient expression of the
glucoamylase cDNA, and loss of plasmids unless special conditions
(selective pressure) are used for their maintenance. These problems, however,
are not insurmountable. First, the level of glucoamylase expression was
increased approximately fivefold by deleting a 175-base-pair (bp) negative
regulatory region from the ENO1 promoter on the plasmid. Second, the
plasmid was modified by deleting its yeast origin of replication and adding
a segment of DNA that is homologous to a yeast chromosomal site, thereby
converting it to an integrating vector. With this form of the plasmid, the
complete glucoamylase construct was incorporated into a chromosomal
site and stably maintained. Third, another S. cerevisiae strain (brewer's
yeast) that tolerates high levels of alcohol was used as the host cell. The
integrating vector was used to transform this yeast strain.

As a result of these modifications, investigators created two novel yeast
strains that performed better than a naturally occurring amylolytic
(starch-hydrolyzing) yeast, Saccharomyces diastaticus, which is closely
related to S. cerevisiae and can hydrolyze and ferment soluble starch
(Table 14.4). The performance of the brewer's strain, which contained the
integrated glucoamylase gene, was superior to that of the laboratory strain
with the same gene on a multicopy plasmid. This difference probably reflects
plasmid instability with the concomitant loss of the introduced glucoamylase
gene. Neither the laboratory strain nor the brewer's strain of S. cerevisiae
was able to utilize soluble starch unless the strain was transformed with the
cloned glucoamylase gene. In both the plasmid and integrated forms, the A.
awamori glucoamylase cDNA was under the control of regulatory signals of
ENO1 from which the 175-bp negative regulatory region had been removed.
The plasmid was maintained under selective pressure.
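
One way to put the ethanol yields of Table 14.4 in context is to compare them with the stoichiometric maximum for fermentation of glucose (one glucose gives two ethanol and two CO2). The sketch below does this under a stated assumption: it treats the tabulated yields as grams of ethanol per gram of glucose equivalent, which the table itself does not specify, so the percentages are indicative only.

```python
GLUCOSE = 180.16  # g/mol
ETHANOL = 46.07   # g/mol

# Glucose -> 2 ethanol + 2 CO2, so the upper bound is about 0.51 g/g.
MAX_YIELD = 2 * ETHANOL / GLUCOSE

# Ethanol yield (g/g of substrate) for the starch-fermenting strains in Table 14.4.
TABLE_14_4_YIELDS = {
    "Laboratory + gene on a plasmid": 0.41,
    "Brewer's + integrated gene": 0.48,
    "S. diastaticus": 0.38,
}

def fraction_of_maximum(reported_yield):
    """Reported yield as a fraction of the stoichiometric maximum."""
    return reported_yield / MAX_YIELD

if __name__ == "__main__":
    print(f"stoichiometric maximum: {MAX_YIELD:.3f} g ethanol per g glucose")
    for strain, y in TABLE_14_4_YIELDS.items():
        print(f"{strain}: {100 * fraction_of_maximum(y):.0f}% of maximum")
```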

In an effort to produce large amounts of glucoamylase, several copies
of the glucoamylase gene were integrated into the chromosomal DNA of
the fungus A. niger. Surprisingly, there was no correlation between the
number of copies of the glucoamylase gene in the chromosomal DNA and
the amount of measured enzyme activity. On the other hand, the level of
enzyme activity was strongly dependent upon the sites where the genes
were inserted. Thus, merely increasing the gene copy number is not
sufficient to produce a larger amount of active enzyme.

While most strains of the yeast S. cerevisiae encode only an intracellular
glucoamylase, which is generally expressed only during sporulation, a few
strains encode a secreted glucoamylase. Unfortunately, none of these
enzymes contains a starch-binding domain. Thus, while S. cerevisiae
glucoamylases can hydrolyze soluble starch (dextrins), they are unable to
break down the larger molecules of insoluble starch. In an attempt to
construct a strain of S. cerevisiae that could hydrolyze insoluble starch and
therefore be of use in a variety of industrial processes, the coding region of
a secreted S. cerevisiae glucoamylase gene was fused to a DNA fragment
encoding a starch-binding domain from the fungus A. niger (Fig. 14.18).
When this construct was expressed in S. cerevisiae, the presence of the
starch-binding domain increased the ability of the enzyme to degrade
insoluble starch by about sixfold. While this system is far from optimized,
this simple genetic manipulation is an important initial step in the
development of more efficient industrial systems for the production of
alcohol from starch.

TABLE 14.4 Fermentation of soluble starch (25% [wt/vol]) by various yeast strains

Strain                           Carbohydrate    Ethanol produced   Ethanol yield
                                 utilized (%)    (g/liter)          (g/g of substrate)
Laboratory                       5               <0.1               0
Laboratory + gene on a plasmid   68              75.6               0.41
Brewer's                         <1              3.1                0
Brewer's + integrated gene       93              118.2              0.48
S. diastaticus                   43              44.2               0.38

Adapted from Cole et al., Bio/Technology 6:417-421, 1988.

FIGURE 14.18 The genetic construct used to produce in S. cerevisiae a secreted
glucoamylase containing a starch-binding domain (from A. niger). The construct
is transcribed under the control of the yeast pGAL promoter in the direction
indicated by the arrow. Elements labeled in the figure: pGAL, secretion domain,
catalytic domain, starch-binding domain.

Low-ethanol wines. In recent years, many wine drinkers have expressed a
strong preference for wines that contain only low levels of alcohol. In
addition, consumer preferences have also moved to wines with a high flavor
intensity that are prepared from fully matured grapes. However, the juice
that is obtained from fully matured grapes generally contains a very high
sugar concentration, which in turn produces wines with high levels of
alcohol. A number of attempts have been made to engineer brewer's yeasts
(S. cerevisiae) to reduce the ethanol content of the wine that is produced.
Although a number of these approaches have successfully lowered the
ethanol concentration, they typically cause the accumulation of undesirable
side products. Nevertheless, one novel strategy seems to have avoided
many of the problems of past efforts. By expressing in S. cerevisiae a gene
(noxE) from the bacterium Lactococcus lactis that encodes an H2O-NADH
oxidase (Fig. 14.19), it was possible to significantly alter some of the
metabolic fluxes within the yeast cell. Transformed yeast cells that carried
this gene within their chromosomal DNA had an intracellular NADH content
that was about 75 to 80% lower than that of the native yeast strain and an
oxidized nicotinamide adenine dinucleotide (NAD+) concentration that
was 32 to 45% higher. At the same time, the reduced nicotinamide adenine
dinucleotide phosphate (NADPH) and oxidized nicotinamide adenine
dinucleotide phosphate (NADP+) ratios of the two yeast strains were
identical. The transformed yeast showed a 15% decrease in the amount of
ethanol produced and, unfortunately, increases of approximately threefold in
the amounts of acetaldehyde and acetic acid that were produced, which
impair both growth and fermentation, as well as imparting unacceptable
flavors to the wine. To overcome these side effects, researchers undertook
a systematic study of the growth of this modified yeast strain under a wide
range of conditions. It was observed that if oxygen was supplied to the
growing cells only during the stationary phase, there was a 7% reduction
in the ethanol yield compared to the native strain, without the inhibitory
levels of acetaldehyde and acetone being formed. Researchers are now
investigating whether the wine that is produced using the above-mentioned
yeast strain is suitable for human consumption.

FIGURE 14.19 (A) Oxidation of NADH catalyzed by the H2O-NADH oxidase encoded
by the noxE gene from the bacterium L. lactis; (B) the bacterial noxE gene under the
transcriptional control of the yeast glyceraldehyde 3-phosphate dehydrogenase
gene promoter and the yeast phosphoglycerate kinase transcription terminator. The
entire construct was integrated into the yeast chromosomal DNA.

Improving Fructose Production 

The enzyme glucose isomerase should really be called xylose/glucose
isomerase, because it primarily catalyzes the conversion of the five-carbon
sugar D-xylose to D-xylulose, with the conversion of D-glucose to D-fructose
being a secondary or side reaction (Fig. 14.20). Kinetically, xylose/glucose
isomerase has a lower kcat (catalytic rate constant) and a higher Km (binding
constant) for glucose than for xylose, which means that xylose is bound
more tightly to the enzyme than is glucose and that xylose is converted
more rapidly to xylulose than glucose is converted to fructose.
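
The effect of a lower kcat and a higher Km can be made concrete with the Michaelis-Menten rate law, v = kcat[E][S]/(Km + [S]). The sketch below is illustrative only: the kinetic parameters are hypothetical values chosen to show the trend and are not measurements for any real xylose/glucose isomerase.

```python
def mm_rate(substrate, kcat, km, enzyme=1.0):
    """Michaelis-Menten reaction rate for a given substrate concentration."""
    return kcat * enzyme * substrate / (km + substrate)

def catalytic_efficiency(kcat, km):
    """kcat/Km, the usual measure of how well an enzyme handles a substrate."""
    return kcat / km

if __name__ == "__main__":
    # Hypothetical parameters (arbitrary units), chosen only for illustration.
    params = {
        "xylose": {"kcat": 1000.0, "km": 5.0},
        "glucose": {"kcat": 500.0, "km": 100.0},
    }
    for sugar, p in params.items():
        rate = mm_rate(substrate=10.0, **p)
        print(f"{sugar}: rate at [S]=10 is {rate:.1f}, "
              f"kcat/Km = {catalytic_efficiency(**p):.1f}")
```

With these assumed numbers, the substrate with the higher kcat and lower Km is turned over faster at every substrate concentration, which is the situation the text describes for xylose relative to glucose.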

Xylose/glucose isomerases are intracellular enzymes and, as such, do
not yield the same quantities or purity of product as do the extracellular, or
secreted, enzymes that are used in many industrial processes. Most
industrial enzymes are used without any extensive purification. An
extracellular enzyme preparation generally contains many fewer proteins than
an intracellular extract. In addition, the preparation of an intracellular
protein extract requires separation of the cells from the growth medium,
mechanical disruption of the cells, and removal of cell debris following
disruption. These factors lead to higher production costs for xylose/glucose
isomerase than for many other industrial enzymes. One way to overcome this
problem is to use a batch of xylose/glucose isomerase more than once. This
recycling can be achieved by immobilizing the enzyme on a solid support,
which both stabilizes the enzyme and facilitates its reuse.

The isomerization of glucose to fructose is a reversible reaction, and the
final fructose content is dependent on the reaction temperature. The higher
the temperature, the greater the fructose content in the final product. Most
commercial processes use conversion temperatures of around 60°C. The
enzyme is typically used in an immobilized state that is obtained by
cross-linking it to itself with glutaraldehyde and then using it in a
continuous process in a packed bed reactor. Under these conditions, a batch
of enzyme can be used for approximately 150 to 200 days before it is
discarded. Consequently, increasing the temperature optimum for the enzymatic
activity and the thermostability of xylose/glucose isomerase is one way to
make it more efficient.

FIGURE 14.20 Conversion of glucose to fructose, catalyzed by glucose isomerase.

The thermophilic bacterium Thermus thermophilus produces a
xylose/glucose isomerase that not only is active at 95°C, but also is very
stable at high temperatures. Therefore, this enzyme is a good candidate for
use in industrial processes. Unfortunately, wild-type T. thermophilus does not
produce large amounts of the enzyme. To circumvent this problem, the T.
thermophilus xylose/glucose isomerase gene was isolated and expressed in
E. coli and Bacillus brevis under the control of various promoters and
ribosome-binding sites (Table 14.5). One of the constructs (the last one
listed in Table 14.5) overproduced xylose/glucose isomerase more than
1,000-fold relative to the amount found in the original organism. Therefore,
with this construct, high yields of a thermostable xylose/glucose isomerase
can be produced for the industrial synthesis of fructose from glucose.
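
The "more than 1,000-fold" figure can be checked directly against the activities listed in Table 14.5. The short sketch below simply divides each reported activity by that of the original T. thermophilus strain; the variable and function names are illustrative.

```python
# (host species, promoter source, ribosome-binding site source, units/liter),
# in the row order of Table 14.5.
TABLE_14_5 = [
    ("T. thermophilus", "T. thermophilus", "T. thermophilus", 20),
    ("E. coli", "E. coli lac", "T. thermophilus", 190),
    ("E. coli", "E. coli tac", "T. thermophilus", 1790),
    ("E. coli", "E. coli tac", "E. coli", 3260),
    ("E. coli", "Phage T7", "E. coli", 7050),
    ("B. brevis", "B. brevis cwp", "T. thermophilus", 1400),
    ("B. brevis", "B. brevis cwp", "T. thermophilus", 25000),
]

NATIVE_ACTIVITY = TABLE_14_5[0][3]  # first row: the original producing strain

def fold_over_native(activity):
    """Enzyme activity relative to the original T. thermophilus strain."""
    return activity / NATIVE_ACTIVITY

if __name__ == "__main__":
    for host, promoter, rbs, activity in TABLE_14_5:
        print(f"{host:16s} {promoter:16s} {fold_over_native(activity):8.1f}-fold")
```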

In addition, the substrate specificity of the enzyme can be enhanced. In
one series of experiments, site-directed mutagenesis was used to change
the nucleotides encoding either one or two amino acids of the
xylose/glucose isomerase from the thermophilic organism Clostridium
thermosulfurogenes. The targeted sites were selected for modification because
of other evidence indicating that the corresponding amino acids were involved
in substrate binding. Changing the tryptophan at amino acid residue 139 to
phenylalanine or the valine at amino acid residue 186 to threonine
produces a 1.7- to 2.6-fold increase in the catalytic efficiency (kcat/Km) of
the enzyme toward glucose (Table 14.6). Moreover, these changes cause a two-
to sevenfold reduction in the kcat/Km values of the enzyme toward xylose.
When an enzyme has both of these amino acid changes, the kcat/Km value
for glucose increases by 5.7-fold and the kcat/Km value for xylose decreases
by 4.5-fold. The double amino acid modification changes an enzyme that
was initially 17 times more reactive with xylose than with glucose to one
that is now 1.5 times more reactive with glucose than it is with xylose. The
shift in specificity that has been achieved, together with the thermostability
of this xylose/glucose isomerase, should make it attractive for use in the
industrial conversion of glucose to fructose.
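
The fold changes quoted above follow directly from the kcat/Km values in Table 14.6. The sketch below recomputes them; the dictionary keys and function names are illustrative, while the numbers are those given in the table.

```python
# kcat/Km (per min per mM) toward (glucose, xylose), from Table 14.6.
TABLE_14_6 = {
    "wild type": (5.8, 97.2),
    "Trp-139 -> Phe": (15.0, 13.6),
    "Val-186 -> Thr": (9.7, 55.4),
    "Trp-139 -> Phe, Val-186 -> Thr": (32.9, 21.6),
}

def glucose_gain(variant):
    """Fold increase in kcat/Km toward glucose relative to the wild type."""
    return TABLE_14_6[variant][0] / TABLE_14_6["wild type"][0]

def xylose_loss(variant):
    """Fold decrease in kcat/Km toward xylose relative to the wild type."""
    return TABLE_14_6["wild type"][1] / TABLE_14_6[variant][1]

def glucose_preference(variant):
    """Ratio of kcat/Km for glucose to kcat/Km for xylose."""
    glucose, xylose = TABLE_14_6[variant]
    return glucose / xylose

if __name__ == "__main__":
    for variant in TABLE_14_6:
        print(f"{variant}: glucose x{glucose_gain(variant):.1f}, "
              f"xylose /{xylose_loss(variant):.1f}, "
              f"glucose:xylose = {glucose_preference(variant):.2f}")
```

The wild-type preference ratio is about 0.06 (xylose favored roughly 17-fold), while the double mutant has a ratio of about 1.5 in favor of glucose.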

TABLE 14.5 Amounts of T. thermophilus xylose/glucose isomerase in different bacteria

Species           Plasmid     Promoter source    Source of ribosome-   Enzyme activity
                  copy no.                       binding site          (units/liter)
T. thermophilus   None        T. thermophilus    T. thermophilus       20
E. coli           200         E. coli lac        T. thermophilus       190
E. coli           20          E. coli tac        T. thermophilus       1,790
E. coli           20          E. coli tac        E. coli               3,260
E. coli           20          Phage T7 φ10       E. coli               7,050
B. brevis         20          B. brevis cwp      T. thermophilus       1,400
B. brevis         20          B. brevis cwp      T. thermophilus       25,000

Adapted from Dekker et al., Appl. Microbiol. Biotechnol. 36:727-732, 1992.

The first row represents data for the original enzyme-producing strain. All the
other strains are transformants carrying the T. thermophilus xylose/glucose
isomerase gene on a multicopy plasmid.

TABLE 14.6 Catalytic efficiency of wild-type and mutant xylose/glucose isomerases
from C. thermosulfurogenes

Amino acid change(s)              Catalytic efficiency (kcat/Km)
                                  (min⁻¹ mM⁻¹) toward:
                                  Glucose      Xylose
None (wild type)                  5.8          97.2
Trp-139 → Phe                     15           13.6
Val-186 → Thr                     9.7          55.4
Trp-139 → Phe, Val-186 → Thr      32.9         21.6

Adapted from Meng et al., Proc. Natl. Acad. Sci. USA 88:4015-4019, 1991.

Silage Fermentation

Crops such as grasses, corn, and alfalfa need to be preserved so that they can
be used as animal feed many months after the crop is harvested. Traditionally,
these crops are preserved by naturally occurring lactic acid bacteria that use
the crop as a fermentation substrate to produce lactic and acetic acids. The
resulting low pH restricts the growth and metabolic activity of other
microorganisms and ensures that the crop is preserved. This preservation
strategy is called the making of silage. Often, the numbers of lactic acid
bacteria that are found on a fresh crop are quite small, so that a bacterial
inoculum, typically Lactobacillus plantarum, must be added. Unfortunately,
these bacterial inoculants are not especially effective when the amount of
water-soluble carbohydrates in the fresh crop is insufficient to support both
bacterial growth and lactic acid production.

To develop a bacterium that might be effective in silage fermentation,
an α-amylase gene from a strain of Lactobacillus amylovorus that does not
support silage fermentation was spliced into the L. plantarum gene for
conjugated bile acid hydrolase (cbh) and integrated into the chromosomal
DNA of a strain of L. plantarum (Fig. 14.21). The cbh gene is a dispensable
gene when the bacterium is grown on silage; it encodes an enzyme that is
active only when the bacterium is located in an animal's intestine. This
work is an important first step in the development of L. plantarum strains
that are more effective in the fermentation of silage from crops such as
alfalfa, which contain a high level of starch.

Isopropanol Production 

In order to decrease our reliance on nonrenewable petroleum products, it
may be possible to engineer microorganisms to produce isopropanol from
glucose (e.g., derived from starch). Isopropanol may be used directly as a
fuel, it may be used instead of methanol to esterify fats and oils to produce
biodiesel, or it may be dehydrated to yield propylene, which is used to
synthesize the polymer polypropylene. To engineer E. coli to produce
isopropanol, the genes that were introduced into E. coli were based on the
genes encoding the isopropanol biosynthesis pathway that exists in
Clostridium beijerinckii (an organism that produces only moderate amounts of
isopropanol and is difficult to grow and to manipulate genetically). The
engineering of E. coli to produce isopropanol required the addition of four
foreign genes (Fig. 14.22). The initial source of all four genes was
Clostridium acetobutylicum; however, genes encoding the same activities from
several other bacteria were also tested in an effort to obtain a transformed
strain of E. coli that produced the greatest amount of isopropanol. The best
combination of foreign genes encoding enzymes in the isopropanol
biosynthesis pathway produced nearly three times as much isopropanol as the
best reported strain of C. beijerinckii, indicating that this strain has
significant potential for use in the industrial production of isopropanol.
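
The engineered route from glucose to isopropanol can be written down as an ordered list of steps, using the enzyme sources given in the legend to Fig. 14.22. The sketch below checks that the steps form an unbroken chain and estimates a carbon-stoichiometric upper bound on yield, assuming one isopropanol per glucose (two acetyl-CoA condense to one acetoacetyl-CoA, with the remaining carbon lost as CO2). Cofactor and growth requirements are ignored, and the data structure and function names are illustrative.

```python
# (step label, substrate, product, source of the enzyme), following Fig. 14.22.
PATHWAY = [
    ("A", "glucose", "pyruvate", "E. coli"),
    ("B", "pyruvate", "acetyl-CoA", "E. coli"),
    ("C", "acetyl-CoA", "acetoacetyl-CoA", "C. acetobutylicum"),
    ("D", "acetoacetyl-CoA", "acetoacetate", "E. coli"),
    ("E", "acetoacetate", "acetone", "C. acetobutylicum"),
    ("F", "acetone", "isopropanol", "C. beijerinckii"),
]

GLUCOSE = 180.16     # g/mol
ISOPROPANOL = 60.10  # g/mol

def is_connected(pathway):
    """True if each step's product is the next step's substrate."""
    return all(a[2] == b[1] for a, b in zip(pathway, pathway[1:]))

# Assumption: at most one isopropanol per glucose on a carbon basis.
MAX_YIELD = ISOPROPANOL / GLUCOSE

if __name__ == "__main__":
    print(" -> ".join([PATHWAY[0][1]] + [step[2] for step in PATHWAY]))
    print("pathway is connected:", is_connected(PATHWAY))
    print(f"upper bound: {MAX_YIELD:.3f} g isopropanol per g glucose")
```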
FIGURE 14.21 Chromosomal integration of an α-amylase gene into L. plantarum. The
α-amylase gene is cloned into the L. plantarum cbh gene on an E. coli-L. plantarum
shuttle vector, which is then used to transform L. plantarum. Erythromycin-resistant
and α-amylase-positive clones of L. plantarum result from a single crossover
between the chromosomal and plasmid DNA at the cbh locus. After the growth of
transformed L. plantarum for approximately 30 generations in the absence of
selective pressure, intrachromosomal recombination resulted in the excision of the
erythromycin resistance (Erm^r) gene, the chromosomal copy of the cbh gene, and the
plasmid DNA. The final engineered L. plantarum carries only an α-amylase gene
and no selectable marker genes.


Engineering Yeast Transcription 

Conventional mutagenesis and selection have historically been used to
improve the useful behavior of a range of microorganisms. Most
prominently, from a biotechnological perspective, this approach has been
utilized to develop microbial strains that overproduce specific antibiotics.
The advantage of the approach is that it does not predetermine which gene(s)
will be altered. However, conventional mutagenesis is a slow and tedious
process requiring an inordinate amount of testing of mutated strains.
Moreover, conventional mutagenesis and selection, or even sequential
rounds of random mutagenesis of an isolated DNA fragment, introduce
only a limited number of changes at each round of mutagenesis.
Unfortunately, sometimes changing a fundamental property of a
microorganism may require altering the expression of dozens or even hundreds
of genes. While this cannot be achieved using the above-mentioned approaches,
the reprogramming of a microorganism may be realized by mutagenizing
one of the proteins that is responsible for regulating global transcription of
the microorganism.
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FIGURE 14.22 Engineering E. coli to produce isopropanol; a simplified overview of 
the biosynthetic pathway. The introduced portion of the pathway is highlighted in 
yellow. The enzymes involved in this pathway are as follows: several E. coli enzy¬ 
matic steps are required for the conversion of glucose to pyruvate (A); pyruvate is 
converted to acetyl-CoAby an E. coli enzyme (B); acetyl-CoA acetyltransferase from 
C. acetobutylicwn catalyzes the formation of acetoacetyl-CoA (C); acetoacetyl-CoA 
transferase from E. coli catalyzes the production of acetoacetate (D); acetoacetate 
decarboxylase from C. acetoiutylicum catalyzes the synthesis of acetone (E); and 
secondary alcohol dehydrogenase from C. beijerinckii produces isopropanol from 
acetone (F). 


In order to engineer yeast to more efficiently produce alcohol, it is
necessary that the yeast strain be able to tolerate high concentrations of both
glucose and ethanol. To achieve this end, it is likely that the expression of
a large number of yeast proteins must be altered. Moreover, it is by no
means clear which proteins need to have their levels of expression either
increased or decreased. Instead, one group of researchers undertook to
reprogram a significant portion of yeast metabolism by generating a large
number of randomly mutated yeast transcription factor genes. In a small
number of cases, the modified transcription factor will alter yeast gene
expression in a manner that increases the tolerance of yeast for high levels
of both glucose and ethanol. In yeast, 15 different proteins bind to DNA
and regulate the promoter specificity of RNA polymerase II. To dramatically
modify yeast metabolism (Fig. 14.23), the gene encoding transcription
factor SPT15 was altered by error-prone PCR (see chapter 8). This
amplification reaction produces a range of different mutagenized SPT15 genes. All
of the mutagenized SPT15 genes were cloned onto a plasmid vector and
then introduced into wild-type yeast (S. cerevisiae) cells. The transformed
cells were grown on agar medium in the presence of 6% ethanol and 120 g/liter
of glucose. Any transformant that grew better on this medium than the
native yeast strain was then more fully characterized. One particular
mutant displayed a higher level of cell viability and grew faster than the
native yeast strain at ethanol concentrations of between 10 and 20%.

FIGURE 14.23 Engineering the yeast S. cerevisiae to tolerate high levels of glucose and
ethanol. The yeast transcription factor SPT15 gene was mutagenized by error-prone
PCR, cloned into a plasmid vector, and used to transform a laboratory yeast strain.
The transformants were selected for the ability to grow on medium containing both
6% ethanol and 12% glucose. The portions of the SPT15 gene shown in red
represent the introduced mutations.

When this strain was examined in detail, it was found that the SPT15 gene had
three separate mutations, all of which were required for this activity. In
addition, when the mutant was transcriptionally profiled (using microarray
technology), several hundred genes were found to be differentially regulated
compared to the native strain, with the majority of genes being
upregulated. Interestingly, a detailed analysis of the genes whose expression
was significantly altered did not reveal that a particular pathway or
genetic network was primarily responsible for the observed reprogramming.
Finally, the yeast strain that was used in these experiments was a
standard laboratory strain, so that to incorporate this approach into a
commercial process, it will be necessary to repeat the work with an industrial
strain of yeast. If successful, this type of genetic manipulation could
facilitate the development of strains of yeast that more efficiently convert
glucose to ethanol.
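
The library-construction step described above can be pictured with a small simulation. This is only a minimal sketch and not the researchers' protocol: the template string is a made-up placeholder rather than the SPT15 sequence, and the per-base error rate, the uniform substitution model, and all function names are assumptions chosen for illustration.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA string, substituting each base with probability error_rate."""
    copied = []
    for base in template:
        if rng.random() < error_rate:
            # Replace with one of the three other bases (uniform choice; an assumption).
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def mutant_library(template, n_variants, error_rate, seed=0):
    """Build a library of independently mutagenized copies of the template."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(n_variants)]


def count_mutations(template, variant):
    """Number of positions at which the variant differs from the template."""
    return sum(1 for a, b in zip(template, variant) if a != b)


if __name__ == "__main__":
    template = "ATGGCTGACGAAGAACGTCTGAAAGAA"  # hypothetical placeholder sequence
    library = mutant_library(template, n_variants=1000, error_rate=0.02)
    triple = [v for v in library if count_mutations(template, v) == 3]
    print(f"{len(triple)} of {len(library)} variants carry exactly three substitutions")
```

Under these assumed settings most simulated variants carry few or no substitutions, which is consistent with the need to screen a large library under stringent selection to find a rare multiply mutated gene.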


Utilization of Cellulose 

With an increasing world population and dramatic increases in the standard
of living in parts of the developing world, meeting the growing
worldwide demand for energy for heating, transportation, and industry
has become a major challenge for all countries for the 21st century and
beyond.

TABLE 14.7 Typical compositions of various lignocellulosic materials

                        Amount (%) of:
Raw material            Lignin    Cellulose    Hemicellulose
Pine wood               27.8      44.0         26.0
Birch wood              19.5      40.0         39.0
Sugarcane bagasse       18.9      33.4         30.0
Rice straw              12.5      32.1         24.0
Cotton                  12.5      32.1         24.0

Adapted from Brown, Philos. Trans. R. Soc. Lond. B 300:305-322, 1983.

Moreover, in addition to being sustainable, the future energy
supply needs to be nonpolluting so that we may realize a reduction in
greenhouse gas emissions. To this end, many countries around the world
have begun to produce large amounts of alternative fuels in an effort to at
least partially replace nonrenewable fossil fuels, such as oil and gas. In this
regard, a major effort has been directed toward producing bioethanol. At
the present time, Brazil produces large amounts of ethanol from the
fermentation of sucrose derived from sugarcane, and the United States
produces ethanol from corn starch. However, the reduction of greenhouse
gases that results from the use of sugar- or starch-based ethanol is not as
high as desired, and many socially conscious individuals have criticized
the strategy of converting land from the production of food to the
production of ethanol/energy. Moreover, sugar- or starch-based ethanol is unlikely
to provide more than a small fraction of what we require. In this regard, in
2008, the Chinese government announced that it would not allow any further
increase in starch-based ethanol production because of competing uses
as food. If the world is going to produce ethanol on a large enough scale to
significantly lower our use of fossil fuels, that ethanol will have to be
produced from lignocellulosic waste products, such as corn stover, grasses,
and wood chips. Thus, there is now, more than ever before, a tremendous
amount of both political and scientific activity directed toward trying to
produce ethanol from lignocellulosic materials.

Lignocellulosics 

The polymers lignin, hemicellulose, and cellulose combine in various
proportions to form a "lignocellulosic" structural support system for nearly all
terrestrial plants (Table 14.7). This material constitutes a vast biomass that
is often a waste product of agriculture, timber processing, and other human
activity and needs to be disposed of in a safe and efficient manner or used
as a resource. It has been estimated that approximately 10¹¹ tons of these
polymers is synthesized annually in the biosphere, with an energy content that
is equivalent to around 640 billion tons of oil.

Lignocellulosic materials (cellulosics) have been grouped into three
classes.

• Primary cellulosics include plants that are harvested specifically for
cellulosic content, structural use, or feed value, e.g., cotton, timber,
and hay.

• Agricultural waste cellulosics are the plant materials that remain
after harvesting and processing, e.g., straw, corn stover, rice hulls,
sugarcane bagasse, animal manures, and timber residues.

• Municipal waste cellulosics encompass wastepaper and other
discarded paper products.

FIGURE 14.24 Schematic representation of lignin structure, showing some of the
various possible linkages between the phenylpropane (a C₆ aromatic group attached
to a C₃ alkyl chain) units. The phenylpropane units are linked in an unorganized,
nonrepeating fashion.


Components of Lignocellulose 

Lignin is a three-dimensional, globular, irregular, insoluble,
high-molecular-weight (>10,000) polymer made up of phenylpropane subunits with no
chains of regular repeating units or any bonds that are easily hydrolyzed
either enzymatically or chemically (Fig. 14.24).

FIGURE 14.25 Structure of a portion of a cellulose chain. Glucose residues are joined
head to tail by β-1,4 linkages.

The lignin polymer molecule has many different types of chemical linkages
between aromatic phenylpropane units. The physical and chemical
characteristics of lignin are generally attributed to the last step in lignin
biosynthesis, the nonenzymatic free-radical-based joining of the phenylpropane
units in a more or less random fashion. In plants, lignin is chemically bonded
to hemicellulose and wraps around fibers composed of cellulose. Lignin is
responsible for the rigidity of plants and for their resistance to mechanical
stress and microbial attack.

Hemicelluloses are short-chain, heterogeneous polymers that contain
both hexoses (six-carbon sugars, such as glucose, mannose, and galactose)
and pentoses (five-carbon sugars, such as xylose and arabinose). The three
major types of hemicelluloses are xylans, which have a backbone of
poly-β-1,4-xylan, with side links to arabinose, glucuronic acid, and
arabinoglucuronic acid; mannans, which are composed of glucomannans and
galactomannans; and arabinogalactans. The origin of the lignocellulosic
material usually defines the nature of the hemicelluloses. For example,
xylan hemicellulose is particularly common in hardwoods, and glucomannans
are characteristic of softwoods.

Cellulose, which is the simplest of the components found in lignocellulosic
material, is the most abundant polymer in the biosphere. It is composed
of long chains of D-glucose molecules linked in β-1,4 configuration
(Fig. 14.25). Both cellulose and starch can be hydrolyzed to glucose, but
their structures are very different. Starch is an energy storage molecule in
which the glucose residues are linked in a manner that prevents a tightly
ordered arrangement of the polymer chains. This open mesh-like structure
is easily penetrated by water; as a result, starch is both water soluble and
readily hydrolyzable by amylases and glucoamylases. By contrast, cellulose
is a plant-supporting structural molecule. The glucose chains in cellulose
are arranged in a manner that permits them to pack together in a
crystal-like structure that is impervious to water. Consequently, the cellulose
polymer is both insoluble and resistant to hydrolysis.

Nevertheless, cellulose is still a form of stored glucose, so it is the
component of lignocellulosics that has the most potential for conversion into a
variety of useful compounds, such as alcohol. However, before cellulose
can be utilized, it must be released from its complex with lignin and
hemicellulose. For most lignocellulosic materials, this separation requires
treatment with either a strong acid or a strong base or the use of high
temperature and pressure. Regardless of how the cellulose is separated
from the lignocellulose complex, the energy that is necessary to achieve this
adds significantly to the cost of the final product. However, since the
annual production of lignocellulosic materials is huge, effective ways of
enzymatically degrading cellulose and hemicellulose are being sought.
Chemical and enzymatic methods for the selective degradation of lignin
also are being investigated, with less success.


Isolation of Prokaryotic Cellulase Genes 

A wide range of bacteria and fungi are naturally capable of degrading
cellulose through the concerted action of several enzymes that collectively are
referred to as cellulase. Aerobic microorganisms typically secrete large
amounts of these cellulase enzymes into the medium outside of the cell. On
the other hand, in anaerobic microorganisms, cellulase activity is often
found as part of a multiprotein complex called a cellulosome (Fig. 14.26),
which lies on the external surface of the cell (Fig. 14.27).

FIGURE 14.26 Schematic representation of a cellulosome protein complex. This
complex contains both structural and enzymatically active protein components.
The structural components include scaffoldin, a scaffolding protein (shown in
yellow) that contains a strong cellulose-binding module (orange), and a number
of cohesin molecules (nine are shown). The cohesin molecules act as binding
sites (docking sites) for the dockerin proteins (red), which are attached to the
catalytic domains (the active enzymatic components, which are shown in green,
purple, and blue). Each cellulosome complex is anchored to the microbial cell
surface through the cell surface-binding domain (brown).

FIGURE 14.27 Interaction of a cellulolytic bacterium with cellulose. The bacterial
cell surface has protuberance-like structures that contain multiple copies of the
cellulosome (polycellulosome protuberances). On contact with the cellulose
substrate, some of the polycellulosome protuberances protract dramatically
and deposit their cellulosomes along the surface of the cellulose (detached
from the bacterial surface).

Cellulases consist of multiple copies of several enzymes with different enzymatic
activities, including the following.

• Endoglucanase, which hydrolyzes β-1,4 linkages between adjacent
glucose molecules within the amorphous (loosely packed) regions of
the cellulose polymer, thereby breaking the chain in the middle

• Exoglucanase, which degrades the nicked cellulose chains from
their nonreducing ends and produces glucose, cellobiose (two glucose
units), and cellotriose (three glucose units)

• Cellobiohydrolase, which is often found in cellulolytic fungi and is
a type of exoglucanase that removes units of 10 or more glucose
residues from the nonreducing ends of the cellulose molecule

• β-Glucosidase, or cellobiase, which converts cellobiose and
cellotriose to glucose

The breakdown of cellulose by microorganisms (either bacteria or
fungi) that produce the various components of cellulase (Fig. 14.28) is slow
and often incomplete. Therefore, genetic engineering strategies have been
used in an attempt to create organisms with more effective cellulase
activity. For this purpose, genes coding for the individual enzymatic
functions of cellulase activity have been isolated from both prokaryotic and
eukaryotic organisms.

Prokaryotic endoglucanase genes have been cloned by the following 
simple yet effective identification technique. 

1. A clone bank of DNA from a cellulolytic prokaryote is constructed 
in E. coli, and the host cells are grown overnight on solid medium 
containing a selective antibiotic. 

2. The colonies are then overlaid with agar containing carboxymethyl 
cellulose (CMC), a soluble derivative of cellulose, and the petri 
plates are incubated at 37°C for several hours. During this time, the 
CMC molecules that are present in the immediate vicinity of a 
colony that both synthesizes and secretes an endoglucanase are 
partially digested. Transformants that synthesize but do not secrete 
the cloned endoglucanase are not able to degrade the substrate, 
because it is too large to enter the cell. 

3. The digested regions of the CMC are visualized by first flooding 
the petri plate with a solution of the dye Congo red, which is not 
toxic to the bacteria, followed by a wash with a solution of sodium 
chloride. Congo red selectively binds to high-molecular-weight 
cellulose chains and gives a red color; conversely, it binds weakly 
to low-molecular-weight polysaccharides and produces a yellow 
hue. The sodium chloride treatment stabilizes the binding of the 
dye. If a bacterial colony produces a secreted endoglucanase, it will 
be surrounded by a yellow halo; the background, where the CMC 
has not been degraded, will be red (Fig. 14.29). 
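
The logic of the plate assay above reduces to a simple rule: a yellow halo is expected only when a clone both synthesizes and secretes the endoglucanase. The following minimal sketch encodes that rule; the `Colony` class, its field names, and the example clone names are hypothetical and serve only to illustrate the reasoning.

```python
from dataclasses import dataclass


@dataclass
class Colony:
    name: str
    synthesizes_endoglucanase: bool
    secretes_endoglucanase: bool


def halo_color(colony):
    """Expected color around a colony after Congo red staining of the CMC overlay."""
    # CMC is too large to enter the cell, so the enzyme must be made AND exported.
    degrades_cmc = colony.synthesizes_endoglucanase and colony.secretes_endoglucanase
    return "yellow" if degrades_cmc else "red"


if __name__ == "__main__":
    colonies = [
        Colony("clone A", True, True),    # secreted endoglucanase
        Colony("clone B", True, False),   # enzyme stays inside the cell
        Colony("clone C", False, False),  # no endoglucanase gene
    ]
    for colony in colonies:
        print(colony.name, halo_color(colony))
```

The sketch makes explicit why this screen misses clones that express the gene without exporting the protein.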

This technique has been successfully used in the isolation of endoglucanase
genes from Streptomyces, Clostridium, Thermoanaerobacter,
Thermomonospora, Erwinia, Pseudomonas, Cellvibrio, Ruminococcus,
Cellulomonas, Fibrobacter, and Bacillus species.

There is no convenient plate assay for detecting cells with a cloned
exoglucanase gene, so immunological screening has been used to pick out the
recombinant clones that express exoglucanase.

FIGURE 14.28 Enzymatic biodegradation of cellulose. Cellulose hydrolysis begins
with the cleavage of β-1,4 linkages within the accessible amorphous regions of the
cellulose chains by endoglucanase(s). This reaction is followed by the removal of
oligosaccharides from the partially cleaved cellulose chains by exoglucanase(s) and
cellobiohydrolase(s). The degradation of cellulose is completed when the cellobiose
and cellotriose are converted to glucose by β-glucosidase.

Although this approach requires specific antibodies directed against the target
protein, the protein does not have to be secreted to be detected. Recombinant
cells can be lysed in situ, e.g., by exposure to chloroform vapor, before the
cytoplasmic proteins are transferred to a nylon or nitrocellulose membrane for
subsequent immunological testing. In these tests, replica plates are used to
ensure that viable cells are available for further propagation and use.

Prokaryotic β-glucosidase genes have been isolated by transforming a
clone bank from a β-glucosidase-producing microorganism into E. coli and
then selecting for transformants that can grow on minimal media with
cellobiose as the sole carbon source.

FIGURE 14.29 Detection of an E. coli clone expressing a bacterial endoglucanase
gene. The yellow halo indicates the presence of a positive clone that degrades
the soluble cellulose (CMC) in the medium in the vicinity of the clone via the
secreted endoglucanase.

Alternatively, clones that express β-glucosidase activity can be detected with a
chromogenic substrate, such as 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside
(BCIP), in the plating medium or with MacConkey-cellobiose agar. In both cases,
β-glucosidase-positive colonies turn red.

Isolation of Eukaryotic Cellulase Genes 

The strategy of DNA hybridization screening of either cDNA or genomic
clone banks with a heterologous probe has not been particularly effective for
isolating cellulase genes, because the sequences of cellulolytic enzymes from
diverse sources are not very similar. To isolate the messenger RNAs (mRNAs)
that encode cellulolytic enzymes from fungi or plants, a novel protocol had
to be implemented. Unfortunately, these mRNAs usually constitute only a
small fraction of the total mRNA population. Therefore, it is often necessary
to enrich for the target mRNA or cDNA and to eliminate cDNA clones that
do not carry the target sequences. To meet these ends, the technique of
"differential hybridization" has been used for the isolation of a number of
different induced eukaryotic cellulase genes, as follows (Fig. 14.30).

1. mRNA is isolated both from cells grown without cellulose (i.e.,
noninduced cells) and from cells grown in the presence of cellulose
to enhance the synthesis of cellulase enzymes (i.e., induced cells).

2. Each mRNA population is fractionated on a sucrose gradient, and
each fraction is translated in a cell-free system, either rabbit
reticulocytes or wheat germ. A cellulose-induced protein(s) is identified,
after separation of the cell-free translation products on a
polyacrylamide gel, by the presence of unique bands that appear from the
induced cells but not from the noninduced cells (Fig. 14.31). This
step indicates which mRNA fractions contain messengers that are
induced by the addition of cellulose.

3. The mRNA sucrose gradient fractions from cells that direct the
synthesis of cellulose-induced proteins and the comparable
fractions from the noninduced cells are used separately to program the
synthesis of cDNA.

4. The cDNA sample from the induced-cell population is cloned into
a plasmid vector, introduced into E. coli, replica plated, and then
separately screened with labeled cDNA from both the induced and
noninduced fractions as hybridization probes. Clones that hybridize
only with the cDNA from the induced cells and not with cDNA
from the noninduced cells potentially carry cellulose-induced
genes and are characterized further.

5. To establish conclusively which of the positive cDNA clones
encode cellulase enzymes, the DNA from these clones is introduced
into an E. coli expression vector and, after E. coli is transformed
with these constructs, the protein products are detected
with antibodies to the enzymes of the cellulase complex.

6. The sequence of each of the positive cDNA clones is determined.

In principle, this scheme can be used for the isolation of any induced
eukaryotic gene(s).
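
The selection rule in step 4 of the scheme above amounts to a set comparison: keep a clone only if its insert hybridizes with the induced probe and not with the noninduced probe. The sketch below illustrates this under simplifying assumptions; hybridization is reduced to exact membership in a set of transcript names, and the clone identifiers and transcript names are hypothetical examples, not data from the experiments described.

```python
def differential_screen(clone_inserts, induced_probe, noninduced_probe):
    """Return clones whose insert is detected only by the induced-cell probe.

    clone_inserts maps a clone identifier to the transcript its cDNA insert
    represents; each probe is modeled as the set of transcripts it contains.
    """
    positives = []
    for clone_id, transcript in clone_inserts.items():
        hybridizes_induced = transcript in induced_probe
        hybridizes_noninduced = transcript in noninduced_probe
        if hybridizes_induced and not hybridizes_noninduced:
            positives.append(clone_id)
    return sorted(positives)


if __name__ == "__main__":
    # Hypothetical library made from induced cells.
    clone_inserts = {"c1": "actin", "c2": "cbh1", "c3": "egl1", "c4": "tubulin"}
    induced_probe = {"actin", "cbh1", "egl1", "tubulin"}
    noninduced_probe = {"actin", "tubulin"}
    print(differential_screen(clone_inserts, induced_probe, noninduced_probe))
```

Clones that pass this filter are only candidates; as step 5 notes, antibody detection is still needed to confirm that they encode cellulase enzymes.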


Manipulation of Cellulase Genes 

There are a variety of uses for cloned cellulase genes. In some cases, the
cellulose-binding domain encoded by cellulase genes facilitates the
purification of recombinant proteins.

FIGURE 14.30 Differential hybridization for isolating eukaryotic cellulase cDNA
clones.

In other instances, the cellulolytic activity is expressed in organisms that can
convert waste cellulose into a commercial product, such as alcohol.

Many cellulase enzymes have three separate domains: a catalytic region;
a hinge region that is often rich in proline, serine, or threonine residues; and
a cellulose-binding region. The catalytic and binding domains act
independently. This separation of functions can be exploited by cloning the DNA
sequence that encodes the cellulose-binding domain as part of a fusion
gene, where the other portion of the gene encodes a commercial protein.
After expression of this fusion protein, it can be purified by passing a crude
extract through a column packed with cellulose. Under these conditions,
only the fusion protein will bind to the cellulose. Then the fusion protein, in
homogeneous form, can be eluted from the column. The commercial protein
can be retrieved by removal of the cellulose-binding domain by proteolytic
cleavage. This system is similar in principle to immunoaffinity
chromatography, except that it should be less expensive than using antibodies.

For convenience, most cellulase genes are initially cloned and expressed
in E. coli, but other useful microorganisms might be developed by the
introduction of cellulase genes. For example, S. cerevisiae and Zymomonas mobilis,
which both efficiently convert simple sugars, such as glucose, into alcohol,
have been used as hosts for the expression of cellulase genes.

FIGURE 14.31 Schematic representation of stained polyacrylamide gels of translation
products of mRNA fractions, i.e., proteins, following growth of cells either with or
without cellulose. The numbers represent the fraction number following sucrose
gradient centrifugation of mRNA. The red ovals on the gel without cellulose
indicate where bands appear when there is growth in the presence of cellulose.

The idea behind these studies was to test whether the presence of cellulase
activity would enable these organisms to convert cellulose directly to alcohol.

In one series of experiments, endoglucanase and exoglucanase genes
from the bacterium Cellulomonas fimi were each put under the control of an
S. cerevisiae promoter and signal peptide sequence, subcloned onto the
same plasmid vector, and introduced into S. cerevisiae. Some transformants
secreted about 70% of each activity into the growth medium and were able
to degrade the cellulose in filter paper and pretreated wood chips. The rate
and extent of hydrolysis of both of these substrates were increased by
adding β-glucosidase to the mixture, thereby decreasing the amount of
cellobiose that accumulated and preventing end-product (feedback) inhibition
of the endoglucanase and exoglucanase activities by cellobiose (Fig.
14.32). The role of β-glucosidase in the cellulolytic process has been
examined in more detail. Cellobiose acts as a feedback inhibitor of cellulose
hydrolysis, and glucose inhibits cellobiose cleavage. These two regulatory
mechanisms may prevent complete enzymatic breakdown of cellulose.
Instead of adding β-glucosidase to the medium, a β-glucosidase gene could
be cloned into the host cell. To this end, a β-glucosidase gene was isolated
from the cellulolytic fungus Trichoderma reesei, cloned onto a multiple-copy
plasmid, and reintroduced into T. reesei. The transformant strain
overproduced β-glucosidase activity 5.5-fold, and it degraded microcrystalline
cellulose (Avicel), a cellulose derivative, 33% faster than the
nontransformed strain. In addition, when a β-glucosidase gene from the yeast
Saccharomycopsis fibuligera was expressed in S. cerevisiae, the transformed
strain directed the enzyme to the periplasm. The β-glucosidase-producing
S. cerevisiae strain was nearly as efficient at utilizing cellobiose to produce
ethanol as the nontransformed S. cerevisiae strain was at producing alcohol
from glucose. Thus, the presence of β-glucosidase genes enhances the
enzymatic utilization of cellulose and suggests a simple strategy for genetically
engineering more effective cellulolytic alcohol-producing microorganisms.
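
The two feedback loops described above (cellobiose inhibiting cellulose hydrolysis, and glucose inhibiting cellobiose cleavage) can be explored with a toy kinetic model. The sketch below is an illustration only: the rate law, the rate constants, the inhibition constants, and the function name are all assumptions with arbitrary units, and none of them are taken from the experiments described. Amounts are tracked in cellobiose equivalents so that total mass is conserved.

```python
def simulate_hydrolysis(k_cellulase, k_bgl, ki_cellobiose=5.0, ki_glucose=20.0,
                        cellulose0=100.0, hours=24.0, dt=0.01):
    """Euler integration of a toy two-step cellulose hydrolysis model.

    Step 1: cellulose -> cellobiose, slowed by accumulated cellobiose.
    Step 2: cellobiose -> glucose (beta-glucosidase), slowed by glucose.
    Returns (cellulose, cellobiose, glucose) in cellobiose equivalents.
    """
    cellulose, cellobiose, glucose = cellulose0, 0.0, 0.0
    for _ in range(int(round(hours / dt))):
        v1 = k_cellulase * cellulose / (1.0 + cellobiose / ki_cellobiose)
        v2 = k_bgl * cellobiose / (1.0 + glucose / ki_glucose)
        cellulose -= v1 * dt
        cellobiose += (v1 - v2) * dt
        glucose += v2 * dt
    return cellulose, cellobiose, glucose


if __name__ == "__main__":
    without_bgl = simulate_hydrolysis(k_cellulase=0.5, k_bgl=0.0)
    with_bgl = simulate_hydrolysis(k_cellulase=0.5, k_bgl=2.0)
    print(f"cellulose remaining without beta-glucosidase: {without_bgl[0]:.1f}")
    print(f"cellulose remaining with beta-glucosidase:    {with_bgl[0]:.1f}")
```

In this model, removing cellobiose through the second step relieves the inhibition of the first, so less cellulose remains when β-glucosidase activity is present. How large the effect is depends entirely on the assumed constants.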

In addition to converting cellulosic wastes into useful materials,
endoglucanase genes may have some novel uses. For example, when a
winemaking yeast was transformed with an endoglucanase gene under the
control of the constitutively expressed yeast actin gene promoter, the wine
that was produced had an increased fruity aroma. This improvement was
attributed to an increase in the amounts of at least 12 different volatile
compounds, including ethyl propionate, 2-butanol, isoamyl acetate, isoamyl
alcohol, and isobutyric acid. This type of genetic modification opens the
possibility of engineering yeast strains that yield wines with particular
desirable characteristics.

The participation of cellulase enzymes in an industrial process for the
bioconversion of wastepaper to alcohol has also been examined. Wastepaper
was partially digested by the addition of cellulase enzymes at 45°C; then,
the released glucose was fermented by S. cerevisiae at 37°C. By
extrapolation of small-scale results, yields of 400 liters of ethanol per ton of
wastepaper were estimated. If all 100 million tons of wastepaper generated
annually in North America were converted into ethanol and used as fuel,
approximately 16% of the gasoline that is currently being used in North
America could be saved.
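
The scale of this estimate can be checked with a short calculation. The sketch below uses only the two figures quoted above (400 liters per ton and 100 million tons) to obtain the total ethanol volume. The implied gasoline consumption is an inference that assumes the 16% figure refers to liter-for-liter displacement, which the text does not specify; ethanol's lower energy content per liter is ignored here.

```python
ETHANOL_LITERS_PER_TON = 400        # extrapolated yield quoted in the text
WASTEPAPER_TONS_PER_YEAR = 100e6    # North American wastepaper quoted in the text
GASOLINE_FRACTION_SAVED = 0.16      # fraction of gasoline use quoted in the text


def ethanol_volume_liters(tons, liters_per_ton):
    """Total ethanol obtainable from a given mass of wastepaper."""
    return tons * liters_per_ton


def implied_gasoline_use_liters(ethanol_liters, fraction_saved):
    """Gasoline use implied if the ethanol displaces gasoline liter for liter (assumption)."""
    return ethanol_liters / fraction_saved


if __name__ == "__main__":
    ethanol = ethanol_volume_liters(WASTEPAPER_TONS_PER_YEAR, ETHANOL_LITERS_PER_TON)
    gasoline = implied_gasoline_use_liters(ethanol, GASOLINE_FRACTION_SAVED)
    print(f"ethanol: {ethanol:.2e} liters per year")
    print(f"implied gasoline use: {gasoline:.2e} liters per year")
```

The first result, 40 billion liters of ethanol per year, follows directly from the quoted figures; the second number is only as reliable as the liter-for-liter assumption.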

Designer cellulosomes. Scientists hoping to exploit the activity of
microbial cellulosomes to degrade cellulosic waste materials have attempted to
manipulate some of the genes involved in the formation of this complex
and create designer cellulosomes whose degradative activities are directed
toward specific substrates. One of the key ingredients in the assembly and
functioning of a cellulosome is the calcium-dependent high affinity (~10⁹
M⁻¹) of the cohesin domain of the scaffoldin molecule for the dockerin
domain (Fig. 14.26). Unfortunately, within a given species of
microorganism, the dockerin domain binds to all of the cohesin domains with the
same affinity. Thus, in nature, different catalytic domains attached to the
dockerins are randomly incorporated into the cellulosome. However, it is
possible, in the laboratory, to engineer cellulosomes that contain certain
enzymatic activities designed to facilitate the degradation of specific
substrates. In one series of experiments, designer cellulosomes were
constructed that were more effective than the free enzymes (i.e., the enzymatic
components not assembled into a cellulosome complex) and slightly less
effective than native cellulosomes in degrading crystalline cellulose.
However, these designer cellulosomes were not especially active at
degrading straw (an agricultural waste product). When a xylanase gene
was incorporated into the cellulosome complex, the ability of the designer
cellulosome to degrade straw, which contains hemicellulose as well as
cellulose, increased significantly (Fig. 14.33). Although the designer
cellulosomes that have so far been constructed are not yet ready to be used in a
commercial process, it is envisioned that these molecules could one day be
the central component of a very large industry aimed at efficiently and
economically converting cellulosic wastes into usable chemicals.

Zymomonas mobilis 

Although industrial fermentations that produce alcohol are performed
almost exclusively with S. cerevisiae, the bacterium Z. mobilis is a potentially
useful organism for this purpose.

FIGURE 14.32 Metabolic control of cellulose hydrolysis. The green arrows
indicate cellulose degradation. The red arrows indicate feedback inhibition by
a particular metabolite of cellulose degradation. Relief of feedback inhibition
may come from overproducing β-glucosidase, thereby decreasing the
concentration of cellobiose, and removal of glucose by converting it
into other metabolites.

FIGURE 14.33 Digestion of straw by a cellulase-plus-xylanase designer cellulosome
complex, free cellulase enzyme, a cellulase-based designer cellulosome, and free
xylanase enzyme. Although the data are presented as the relative amount of
substrate degraded, in this experiment, the substrate was only partially degraded in all
cases.

Zymomonas is a gram-negative, rod-shaped organism that can ferment glucose,
fructose, and sucrose and produce a relatively high yield of alcohol
(Table 14.8). This high yield of ethanol is probably related to the fact that
Zymomonas does not proliferate extensively (i.e., produce biomass) during
fermentation. Since Zymomonas uses less substrate for biomass formation, more
is available for ethanol production. In this regard, yeast produces 2 mol of
ATP per mol of glucose, whereas Zymomonas uses a different pathway and
produces only 1 mol of ATP per mol of glucose. Historically, Zymomonas has been
used in tropical regions as a fermentative agent for the production of
alcoholic beverages.

Zymomonas produces alcohol at a much higher rate than does S. cerevisiae,
even though the organisms are similar in other features (Table 14.8).

TABLE 14.8 Comparison of Z. mobilis and S. cerevisiae as alcohol producers

                                                          Value for:
Attribute                                                 Z. mobilis    Yeast
Conversion of sugar to ethanol (%)                        96            96
Maximum ethanol concentration (%)                         12            12
Ethanol productivity rate (g g⁻¹ h⁻¹)                     5.67          0.67
Volumetric ethanol productivity rate (g liter⁻¹ h⁻¹)      200           29
Sugar tolerance (%)                                       >40           >40
pH range for ethanol production                           3.5-7.5       2-6.5
Optimum temperature (°C)                                  25-30         30-38

Adapted from Buchholz et al., Trends Biotechnol. 5:199-204, 1987.

The ethanol productivity rate was measured under batch fermentation conditions. The volumetric
ethanol productivity rate was measured during continuous culture. Both strains yielded the same
maximum ethanol concentration (12%) and had the same sugar tolerance (>40%).

However, there are biological and technical constraints that prevent
Zymomonas from being used more widely for alcohol production. First, it
can use only a limited number of carbon substrates for the production of
alcohol. Second, broad-host-range cloning vectors and, as a consequence,
foreign genes are difficult to maintain in this organism. Third, Zymomonas
is naturally resistant to many of the more commonly used antibiotics,
which precludes using the standard antibiotic resistance marker systems
for cloning experiments.

Despite these difficulties, a number of foreign genes have been
successfully introduced into and expressed in Zymomonas. Many of these
experiments have focused on expanding the range of substrates that
Zymomonas can utilize. For example, genes encoding enzymes that hydrolyze
or assimilate lactose, starch, cellulose, xylose, and cellobiose have all
been introduced into Zymomonas (Table 14.9). Transformants were able to
express all of these genes to some extent. However, in most of these cases,
the transformed bacterium was unable to utilize the novel substrate as the
sole carbon source.

In early studies directed toward developing strains of Z. mobilis that
were capable of growth and ethanol production with xylose as a substrate,
the bacterium was transformed with genes encoding the xylose utilization
enzymes glucose/xylose isomerase and xylulokinase. However, these
transformants were limited by their inability to further metabolize the
pentoses (xylulose-5-phosphate, ribulose-5-phosphate, and
ribose-5-phosphate) that are formed after xylose is assimilated
(Fig. 14.34). To remedy this situation, Z. mobilis was transformed with a
plasmid carrying two synthetic operons, one with two xylose assimilation
genes and one with two pentose metabolism genes (Fig. 14.35). The pentose
metabolism genes encode the enzymes transketolase and transaldolase; both
were placed under the control of the Z. mobilis enolase promoter. The
xylose assimilation genes were placed under the transcriptional control of
a strong constitutive promoter from the Z. mobilis gene for
glyceraldehyde-3-phosphate dehydrogenase. Both constructs were cloned onto
an E. coli-Z. mobilis shuttle vector, which was then used to transform
Z. mobilis. As expected, the transformants assimilated xylose and converted
the resulting pentoses to fructose-6-phosphate and
glyceraldehyde-3-phosphate, which, in turn, were readily converted to
ethanol by the Entner-Doudoroff pathway of Z. mobilis. Moreover, the
transformants grew efficiently on either glucose or xylose, as well as on
glucose-xylose mixtures, and converted xylose to ethanol at high yield.
This work demonstrates the feasibility of metabolically engineering
Z. mobilis as an ethanol producer by using xylose, a waste material
produced as a by-product of industrial processes, such as pulp and paper
making, as a carbon source.


TABLE 14.9 Some of the heterologous genes expressed in Z. mobilis

Enzyme encoded                     Enzyme function
α-Amylase                          Breakdown of starch to dextrins and glucose
Endo-1,4-β-D-glucanase             Breakdown of cellulose chains
β-D-Glucosidase                    Breakdown of cellobiose to glucose
CMCase                             Breakdown of soluble cellulose
α-D-Galactosidase                  Breakdown of raffinose, stachyose, and
                                   verbascose into glucose, galactose,
                                   sucrose, and fructose
Lac permease                       Facilitates transport of lactose into
                                   bacterial cells
β-D-Galactosidase                  Breakdown of lactose
Glucoamylase                       Breakdown of starch and dextrins to glucose
Xylose isomerase                   Conversion of xylose to xylulose
Xylulokinase                       Conversion of D-xylulose to D-xylulose
                                   5-phosphate
Xylose permease                    Facilitates transport of xylose into
                                   bacterial cells
Transaldolase                      Conversion of sedoheptulose 7-phosphate and
                                   D-glyceraldehyde 3-phosphate to yield
                                   D-erythrose 4-phosphate and D-fructose
                                   6-phosphate
Phosphomannose isomerase           Conversion of D-mannose 6-phosphate to
                                   D-fructose 6-phosphate
L-Arabinose isomerase              Conversion of L-arabinose to L-ribulose
L-Ribulokinase                     Conversion of L-ribulose to L-ribulose
                                   5-phosphate
L-Ribulose-phosphate-4-epimerase   Conversion of L-ribulose 5-phosphate to
                                   D-xylulose 5-phosphate
Transketolase                      Conversion of D-xylulose-5-phosphate (1) to
                                   sedoheptulose-7-phosphate and
                                   glyceraldehyde-3-phosphate and (2) to
                                   fructose-6-phosphate and
                                   glyceraldehyde-3-phosphate


FIGURE 14.34 Schematic representation of the engineered assimilation of either
xylose or arabinose by Z. mobilis and the engineered conversion of the resultant
pentoses (shown in the green box) to ethanol.


FIGURE 14.35 A Zymomonas-E. coli shuttle vector carrying one operon with genes
encoding enzymes used for xylose assimilation (xylA and xylB) and another with
genes encoding enzymes involved in pentose metabolism (tktA and talB). Peno,
enolase promoter; Pgap, glyceraldehyde-3-phosphate dehydrogenase promoter; xylA,
xylose isomerase gene; xylB, xylulokinase gene; tktA, transketolase gene; talB,
transaldolase gene; Tetʳ, tetracycline resistance gene; oriE, E. coli origin of
replication. The Zymomonas DNA contains a Zymomonas origin of replication.
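
The route from xylose to ethanol described above can be checked with a simple carbon balance. The Python sketch below treats the pathway as four lumped steps: xylose assimilation, the net transketolase/transaldolase conversion of three pentose phosphates to two fructose-6-phosphate and one glyceraldehyde-3-phosphate, conversion of these to pyruvate, and decarboxylation of pyruvate to ethanol. The lumping, the omission of cofactors and phosphate groups, the molar masses, and all names in the code are simplifying assumptions made for illustration.

```python
from fractions import Fraction

# Carbon atoms per molecule for the metabolites named in the text.
CARBONS = {
    "xylose": 5,
    "xylulose-5-P": 5,
    "fructose-6-P": 6,
    "glyceraldehyde-3-P": 3,
    "pyruvate": 3,
    "ethanol": 2,
    "CO2": 1,
}

M_XYLOSE = 150.13   # g/mol, standard value
M_ETHANOL = 46.07   # g/mol, standard value

# Lumped steps as (substrates, products); cofactors are deliberately omitted.
STEPS = [
    ({"xylose": 3}, {"xylulose-5-P": 3}),
    ({"xylulose-5-P": 3}, {"fructose-6-P": 2, "glyceraldehyde-3-P": 1}),
    ({"fructose-6-P": 2, "glyceraldehyde-3-P": 1}, {"pyruvate": 5}),
    ({"pyruvate": 5}, {"ethanol": 5, "CO2": 5}),
]


def carbon_atoms(side):
    """Total carbon atoms on one side of a step given as {metabolite: count}."""
    return sum(CARBONS[name] * count for name, count in side.items())


def is_carbon_balanced(steps):
    """True if every step conserves carbon."""
    return all(carbon_atoms(lhs) == carbon_atoms(rhs) for lhs, rhs in steps)


def ethanol_per_xylose(steps):
    """Moles of ethanol formed per mole of xylose consumed."""
    return Fraction(steps[-1][1]["ethanol"], steps[0][0]["xylose"])


def mass_yield(steps):
    """Theoretical grams of ethanol per gram of xylose."""
    return float(ethanol_per_xylose(steps)) * M_ETHANOL / M_XYLOSE


if __name__ == "__main__":
    print("carbon balanced:", is_carbon_balanced(STEPS))
    print("mol ethanol per mol xylose:", ethanol_per_xylose(STEPS))
    print(f"theoretical yield: {mass_yield(STEPS):.3f} g/g")
```

On this accounting, three molecules of xylose give five molecules of ethanol, so the theoretical mass yield from xylose is essentially the same as that from glucose.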

Xylose is the predominant pentose sugar in hardwoods, while arabinose
(Fig. 14.36) is present in large amounts in various agricultural and other
herbaceous plants. Some arabinose-containing plants, such as switchgrass,
have been considered for use as dedicated energy crops, i.e., plants grown
solely for use as sources of energy. Thus, it would be very useful if, in
addition to a strain of Z. mobilis that can convert xylose to ethanol, an
arabinose-fermenting Z. mobilis strain were also available. To develop such
a strain, the E. coli genes encoding the arabinose assimilation enzymes
L-ribulokinase, L-arabinose isomerase, and
L-ribulose-5-phosphate-4-epimerase were isolated and put under the
transcriptional control of the constitutive Z. mobilis
glyceraldehyde-3-phosphate dehydrogenase promoter (Fig. 14.37). Following
expression of these genes, transformants produced the pentoses
xylulose-5-phosphate, ribulose-5-phosphate, and ribose-5-phosphate. The
plasmid used to transform Z. mobilis also contained two pentose metabolism
genes encoding the enzymes transketolase and transaldolase under the
transcriptional control of the constitutive Z. mobilis enolase promoter.
The enzymes encoded by these two genes catalyzed the conversion of the
above-mentioned pentoses to fructose-6-phosphate and
glyceraldehyde-3-phosphate.

The strategy that was used for this work was nearly identical to the
strategy that was employed in the development of Z. mobilis strains able to
utilize xylose as a carbon source; the only difference was that arabinose
rather than xylose assimilation genes were used. The metabolites that are
produced starting with either xylose or arabinose are converted to ethanol
by the Entner-Doudoroff pathway of Z. mobilis (Fig. 14.34). Moreover, the
xylose- and arabinose-fermenting strains of Z. mobilis might be used
together in a mixed bacterial culture for the conversion of the major
sugars from certain agricultural residues into ethanol. To efficiently and
economically convert lignocellulosic wastes to ethanol, it is necessary to
convert all of the sugars, the pentoses as well as the hexoses, to ethanol.
Z. mobilis lacks the pathways needed for the metabolism of mannose and
galactose, which constitute a significant fraction of the hexoses in
lignocellulosic material. Whether Z. mobilis can be engineered to be the
organism of choice as part of a process of this sort is still an open
question.


FIGURE 14.36 Chemical structures of D-xylose and L-arabinose.


FIGURE 14.37 A Zymomonas-E. coli shuttle vector carrying one operon with genes
encoding enzymes used for arabinose assimilation (araB, araA, and araD) and
another with genes encoding enzymes involved in pentose metabolism (tktA and
talB). Peno, enolase promoter; Pgap, glyceraldehyde-3-phosphate dehydrogenase
promoter; araB, L-ribulokinase gene; araA, L-arabinose isomerase gene; araD,
L-ribulose-5-phosphate-4-epimerase gene; tktA, transketolase gene; talB,
transaldolase gene; Tetʳ, tetracycline resistance gene; oriE, E. coli origin of
replication. The Zymomonas DNA contains a Zymomonas origin of replication.

In addition to genetically engineering Z. mobilis to efficiently utilize 
xylose, arabinose, and glucose, researchers have utilized similar genetic 
approaches in an attempt to modify other microorganisms to utilize sugars 
derived from lignocellulosic materials to produce ethanol. Most of these 
efforts have been directed toward modifying S. cerevisiae, with some 
researchers attempting to engineer some strains of P. putida. 

Instead of genetically engineering Z. mobilis to utilize xylose and
arabinose and convert them into ethanol, some scientists have engineered
E. coli to express some Z. mobilis genes so that it can produce ethanol. In
this case, xylose and arabinose are converted to pyruvate by endogenous
E. coli enzymes. Workers have mutated different E. coli genes to prevent
the pyruvate from being converted into other, unwanted metabolites.
Instead, the pyruvate is converted to ethanol by the enzymes pyruvate
decarboxylase and alcohol dehydrogenase, with the genes for both of these
enzymes coming from Z. mobilis. This sort of genetic modification has also
been used to change the bacterium Klebsiella oxytoca into an ethanologenic
organism. With both E. coli and K. oxytoca, the introduction of Z. mobilis
genes yielded recombinant bacteria that were quite efficient in the
laboratory at converting various sugars, both pentoses and hexoses, into
alcohol. It now remains to be demonstrated whether any of these bacteria
are effective on a large scale with an industrial substrate.

Since several naturally occurring yeast strains can utilize the range of
sugars found in lignocellulosic materials, considerable effort has been
directed to improving the performance of these strains. However, in
contrast to the well-studied laboratory yeast strains, industrial strains
are usually diploid or polyploid (and therefore not as easy to engineer),
and their genetics are not especially well characterized or understood.
Nevertheless, some industrial yeast strains that are able to tolerate the
inhibitory compounds found in hydrolyzed lignocellulosic materials and
ferment both hexoses and xylose are being developed, and some researchers
believe that within the next few years one or more of these organisms could
become the cornerstone of an industrial process to convert lignocellulosic
residues into ethanol.


FIGURE 14.38 The major components of the E. coli formate hydrogen lyase. The
purple arrows show the direction of the electron flux and the proteins through
which the electrons flow (shown in blue). The Hyc proteins are the components of
the formate hydrogen lyase system, which accept electrons from FDH-H (the large
subunit of formate dehydrogenase). HycB is the small subunit of formate
dehydrogenase and acts as the membrane docking target of FDH-H. HycC and HycD
(shown in yellow) are integral membrane proteins and are thought to anchor the
rest of the complex. HycE is the large subunit of hydrogenase 3, and HycG is the
small subunit of hydrogenase 3. HycF is an electron transport protein.


Hydrogen Production 

It has been known for some time that formic acid (formate) can be
produced inexpensively, often as a by-product of the synthesis of other
chemicals, such as acetic acid. Moreover, some types of bacteria are able
to convert formate into hydrogen and carbon dioxide via the formate
hydrogen lyase system. If the bacterial system that is responsible for
hydrogen synthesis could be optimized, it might be possible to develop a
practical system for the synthesis of hydrogen from biomass.

The E. coli formate hydrogen lyase system consists of a large number of
different proteins, including those shown in Fig. 14.38, as well as others
that specifically regulate the synthesis and maturation of these proteins.
To overproduce the E. coli formate hydrogen lyase system and hence increase
the amount of hydrogen, (1) the formate hydrogen lyase repressor gene,
hycA, was inactivated and (2) the formate hydrogen lyase activator gene,
fhlA, was overexpressed. These manipulations resulted in the large subunit
of formate dehydrogenase (FDH-H) and the large subunit of hydrogenase 3
(HycE) being overexpressed 6.5- and 7.0-fold, respectively, compared to the
wild type. These changes resulted in a nearly threefold increase in
hydrogen productivity compared to the wild type. Additional enhancement of
the amount of hydrogen produced was obtained by employing the engineered
E. coli cells under anaerobic conditions in a bioreactor at a very high
density (96 g [dry weight] per liter). When the formate concentration was
maintained below 25 mM, continuous hydrogen synthesis of 23.6 g of hydrogen
per hour per liter was realized. This level of hydrogen production is
sufficient for this system to be considered to have significant potential
for commercial application.
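
To give a sense of scale for these numbers, the Python sketch below converts the reported hydrogen productivity into molar terms and relates it to the cell density and the formate concentration limit. It assumes that all of the hydrogen comes from formate with 1:1 stoichiometry (HCOOH → H2 + CO2) and treats 25 mM as the size of the standing formate pool; both are simplifying assumptions, and the constant and function names are illustrative only.

```python
M_H2 = 2.016             # g/mol, standard molar mass of H2

# Values quoted in the text.
H2_RATE_G = 23.6         # g of H2 per liter per hour
CELL_DENSITY = 96.0      # g (dry weight) of cells per liter
FORMATE_LIMIT_MM = 25.0  # formate was kept below this concentration (mM)


def h2_molar_rate():
    """Hydrogen productivity in mol per liter per hour."""
    return H2_RATE_G / M_H2


def specific_productivity():
    """Grams of H2 per gram (dry weight) of cells per hour."""
    return H2_RATE_G / CELL_DENSITY


def formate_feed_rate():
    """Formate consumed (mol per liter per hour), assuming one formate per H2."""
    return h2_molar_rate()


def min_pool_turnovers_per_hour():
    """Times per hour a 25 mM formate pool would have to be replaced (lower bound)."""
    return formate_feed_rate() / (FORMATE_LIMIT_MM / 1000.0)


if __name__ == "__main__":
    print(f"H2 rate: {h2_molar_rate():.1f} mol per liter per hour")
    print(f"specific productivity: {specific_productivity():.3f} g H2 per g cells per hour")
    print(f"formate pool turnovers: at least {min_pool_turnovers_per_hour():.0f} per hour")
```

Under these assumptions, the formate in the reactor must be replenished hundreds of times per hour, which suggests that continuous, tightly controlled feeding is central to a process of this kind.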


SUMMARY 


Bioremediation is the term that is applied to the use of
microorganisms to clear the environment of contaminating substances.
Many members of the bacterial genus Pseudomonas carry plasmids that
encode enzymes capable of degrading aromatic and halogenated organic
compounds. In most cases, a single plasmid carries the genes encoding
enzymes for a specific degradative pathway. By combining plasmids from
different pseudomonad strains within a single host, it is possible to
create an organism with multiple degradation capabilities. In addition,
by genetic manipulation, the range of substrates degraded by a
particular enzymatic pathway can be extended.

Raw biological material is called biomass and is often used as a
starting material in industrial processes. The use of milled grain for
the production of alcohol or fructose requires a number of enzymatic
steps. The enzymes that are used in these processes are often used only
once and then discarded. To enhance enzymatic conversions and decrease
costs, bacterial genes encoding enzymes that are thermostable, highly
efficient catalytically, or tolerant of alcohol have been cloned,
characterized, and tested.

To improve the commercial production of alcohol, some workers have
genetically transformed the bacterium Z. mobilis with genes that allow
it to utilize a broad range of compounds as carbon sources. Enzymes that
degrade starch can also be used to facilitate the ability of
microorganisms, such as L. plantarum, to ferment silage.

Often, as a consequence of processing biological material, large
amounts of lignocellulose remain. This material generally has been
treated as a waste product. However, there is now interest in using
lignocellulose as a resource for carbon-containing compounds, especially
glucose, that can be used in other processes. Retrieving glucose from
lignocellulose is not an easy matter. Lignocellulose is a complex of
lignin, hemicellulose, and cellulose; without harsh and expensive
pretreatment, it is refractory to enzymatic degradation. Recent research
has focused on characterizing the mechanism of breakdown of cellulose to
glucose. The genes for endoglucanases, exoglucanases, and β-glucosidases
from a variety of organisms have been cloned and characterized, but to
date there has been little success in formulating a set of enzymes that
efficiently degrades cellulose in vitro on a large scale.


REVIEW QUESTIONS


1. How would you genetically engineer a bacterium to degrade
trichloroethylene?

2. Outline a protocol that you would use to clone fungal cellulase genes.

3. Delineate the role of α-amylase and glucoamylase in the industrial
production of alcohol. How might genetic manipulation of the genes
encoding these enzymes be used to improve this process?

4. What is glucose isomerase? Why is it important? How and why would you
modify the gene encoding this enzyme?

5. Elaborate some of the advantages and disadvantages of using Z. mobilis
instead of S. cerevisiae for alcohol production. How would you improve the
industrial performance of Z. mobilis?

6. How can Z. mobilis be engineered to produce ethanol from xylose and
arabinose?

7. Starting with a Pseudomonas strain that can utilize phenol as its sole
carbon source at 0°C, a Pseudomonas strain that can degrade anthracene to
catechol at 35°C, and a Pseudomonas strain that can degrade p-toluene to
protocatechuate at 35°C, suggest a strategy for developing a strain that
can utilize phenol, anthracene, or p-toluene as its sole carbon source at
0°C.

8. Explain how a Pseudomonas strain that carries plasmid pWW0 and does not
normally degrade 4-ethylbenzoate can be genetically manipulated to
hydrolyze this compound.

9. Suggest schemes for the isolation of prokaryotic endoglucanase and
β-glucosidase genes.

10. What is a "superbug"?

11. How can L. plantarum be manipulated to improve its ability to ferment
silage?

12. How can pesticide-degrading enzymes be expressed on the surface of a
bacterium?

13. How would you degrade organic environmental pollutants in the presence
of high levels of radioactivity?

14. How would you engineer yeast strains to more efficiently convert
glucose into ethanol?

15. How would you expand the substrate range of a strain of Burkholderia
sp. that normally degrades 2,4-dinitrotoluene?

16. How would you engineer glucoamylase to be more efficient in degrading
starch?

17. How would you engineer E. coli to produce isopropanol?

18. What is a designer cellulosome? How is it produced?

19. How would you engineer E. coli to produce hydrogen gas from formic
acid?


CHAPTER 15

Plant Growth-Promoting Bacteria


Under natural environmental conditions, successful plant growth and
development and high crop yields depend on the genetic constitution of the
crop species, suitable weather conditions, and soil components, including
the availability of nutrients; the absence of growth-inhibitory substances,
such as salt; the presence of certain beneficial microorganisms; and the
absence of pathogenic ones (called phytopathogens, from phyto, meaning
plant). Some beneficial indigenous soil bacteria and fungi act directly by
providing a plant growth-enhancing product, and others act indirectly. The
latter organisms inhibit the growth of pathogenic soil microorganisms,
thereby preventing them from hindering plant growth.

The direct promotion of plant growth usually entails providing the plant
with a compound that is synthesized by the bacterium, such as fixed
nitrogen or a plant hormone. Also, these bacteria can facilitate the uptake
by the plant of certain nutrients from the environment. The indirect
promotion of plant growth occurs when plant growth-promoting bacteria
lessen or prevent the deleterious effects of phytopathogenic organisms,
either fungi or bacteria, i.e., they act as biocontrol agents. This
activity is called antibiosis, and it either depletes a scarce resource
required by the pathogen or produces a compound that impedes the growth of
the phytopathogenic organism.

Direct stimulation of plant growth and development by plant
growth-promoting bacteria can occur in several different ways. The bacteria
can (1) fix atmospheric nitrogen to ammonia that is used by the plant; (2)
synthesize siderophores that solubilize and sequester iron from the soil
and provide it to plant cells; (3) synthesize phytohormones, such as auxin,
cytokinin, or gibberellin, that enhance various stages of plant growth; (4)
solubilize minerals, such as phosphorus, that are used by the plant; and
(5) synthesize an enzyme that can modulate the level of the plant hormone
ethylene. Any particular plant growth-promoting bacterium may utilize one
or more of these mechanisms.


Much of the recent genetic research directed at creating microbial 
strains with augmented plant growth-promoting activity has focused on a 
few areas of study. 

• Engineering of better biocontrol strains of bacteria to decrease the
damage to plants from a variety of pathogens. This work is aimed at
replacing some of the chemical pesticides that may become environmental
pollutants.

• The use of bacteria to lower ethylene levels in plants. These studies
are directed toward preventing high levels of ethylene from accumulating
in plants and thereby decreasing the damage to the plant from a variety
of environmental stresses, including drought, flooding, salt stress, and
the presence of pathogens.

• The molecular basis of nitrogen fixation. This topic has been
investigated thoroughly to determine whether it is possible to increase
the level of microbial nitrogen fixation and consequently lessen the
current dependency on chemical fertilizers for crop plants.

• Root nodule formation by symbiotic bacteria. This process has been
studied with the aim of producing genetically engineered bacteria that
can outcompete naturally occurring symbiotic bacteria.

• Microbial synthesis of iron-sequestering compounds (siderophores).
These reactions are being characterized in the hope that it might be
possible to produce beneficial strains that prevent the growth of
phytopathogenic microorganisms.

• Manipulation of plant growth-promoting bacteria to facilitate
phytoremediation (the use of plants to remediate contaminated
environments).

Current research in this area mainly deals with plant growth-promoting
bacteria rather than fungi. This is at least partly due to the fact that
scientists have found it difficult or even impossible to grow many
beneficial fungi in culture, so not only is it difficult to manipulate them
in the laboratory, it is also extremely difficult to obtain large enough
amounts of these organisms for inoculation of crops. In the past, bacterial
fertilization had a dubious reputation. During the 1950s in the Soviet
Union, more than 10 million hectares (about 39,000 square miles) of
farmland were treated with diazotrophic (nitrogen-fixing) bacterial
mixtures that consisted primarily of Azotobacter chroococcum and Bacillus
megaterium. In these experiments, about 60% of the time, yields of various
crops were increased by 10 to 20%. However, these field trials were poorly
designed and not replicable, so many researchers were skeptical about the
validity of the work and tended to discount the use of bacterial inoculants
as fertilizing agents on a large scale. In recent years, considerable
progress has been made toward understanding many of the mechanisms employed
by plant growth-promoting bacteria. Thus, there is a much greater
likelihood than in the past that results will be predictable and
reproducible.


Growth Promotion by Free-Living Bacteria 

Plant growth-promoting bacteria include a wide range of bacteria that are
free-living or that form a symbiotic relationship with plants, such as
Rhizobium and Frankia. While numerous free-living soil bacteria are
considered to be plant growth-promoting bacteria (Table 15.1), not all
bacterial strains of a particular genus and species have identical
metabolic capabilities. Thus, for example, some Pseudomonas putida strains
actively promote plant growth, while others have no measurable effect on
plants.


TABLE 15.1 Examples of successful agricultural plant growth stimulation by
free-living plant growth-promoting bacteria

Bacterium                    Plant(s)                          Conditions
Azospirillum brasilense      Guinea grass, millet, sorghum,    Field, greenhouse,
                             bean, wheat, barley, fountain     hydroponic system
                             grass, Sudan grass, corn,
                             chickpea, fava bean, oat, rice
Azospirillum irakense        Winter wheat, corn                Field
Azospirillum lipoferum       Millet, sunflower, corn           Field, greenhouse
Azospirillum sp.             Wheat, corn, millet, mustard,     Field, greenhouse
                             rice, sorghum
Azotobacter chroococcum      Barley                            Growth chamber
Bacillus amyloliquefaciens   Tomato, pepper                    Field
Bacillus cereus              Tomato, pepper                    Field
Bacillus polymyxa            Wheat, sugar beet                 Field
Bacillus pumilus             Tomato, pepper                    Field
Bacillus subtilis            Tomato, pepper, peanut, onion     Field, growth chamber
Bacillus sp.                 Sorghum, wheat                    Field
Burkholderia vietnamiensis   Rice                              Field
Enterobacter cloacae         Tomato, pepper, mung bean         Greenhouse
Pseudomonas cepacia          Winter wheat                      Field, growth chamber
Pseudomonas chlororaphis     Spring wheat                      Field, laboratory
Pseudomonas fluorescens      Winter wheat, potato, tomato,     Field, greenhouse,
                             cucumber, blueberry               growth chamber
Pseudomonas putida           Winter wheat, potato, canola,     Field, greenhouse,
                             cucumber, lettuce, tomato,        growth chamber
                             barley, oat
Pseudomonas syringae         Bean                              Greenhouse
Pseudomonas sp.              Canola, potato, rice, lettuce,    Field, greenhouse,
                             cucumber, tomato, corn            growth chamber,
                                                               hydroponic system

The major applications of bacteria for improving plant growth include
agriculture, horticulture, forestry, and environmental restoration
(phytoremediation). In the past 20 years or so, based on a better
understanding of the mechanisms employed by these bacteria and following a
large number of successful laboratory and field studies, an increasing
number of plant growth-promoting bacteria have been commercialized.

The mechanism most commonly invoked to explain the various effects of
plant growth-promoting bacteria on plants is the production of
phytohormones. Research in this area has focused on the role of a class of
phytohormones called auxins. The most common and best-characterized auxin
is indole-3-acetic acid (IAA), which stimulates in plants both rapid
responses, such as increases in cell elongation, and long-term effects,
such as increases in cell division and differentiation. Since both plants
and plant growth-promoting bacteria can synthesize auxin, it is difficult
for researchers to distinguish between plant responses that result from
bacterial auxin synthesis and those that result from plant auxin synthesis.
This uncertainty notwithstanding, there is considerable evidence to suggest
that many plant growth-promoting bacteria facilitate plant growth by
altering the hormonal balance within a plant.

In the early 1990s, it was discovered that many plant growth-promoting
bacteria contain an enzyme that can modulate levels of the plant hormone
ethylene. This enzyme, 1-aminocyclopropane-1-carboxylate (ACC) deaminase,
cleaves ACC, which is the immediate biosynthetic precursor of ethylene in
plants (Fig. 15.1). As depicted in Fig. 15.2, the bacterium binds to seed
coats or plant roots and then sequesters and cleaves ACC. As a result, the
level of ethylene in the developing (or stressed) plant is lowered. In many
plants, ethylene stimulates germination and breaks the dormancy of the
seeds; however, if the level of ethylene remains high after germination, a
problem that is especially acute when plants are under stress, root
elongation is inhibited. Thus, the ACC deaminase that is provided by a
plant growth-promoting bacterium prevents the inhibition of root elongation
(Fig. 15.3), and consequently, the plant produces longer roots during early
development, resulting in a healthier and larger plant. In addition, many
plant growth-promoting bacteria synthesize IAA. Any bacterial IAA that is
taken up by the plant but not used to promote plant cell elongation or
proliferation stimulates transcription of the gene encoding ACC synthase in
the plant. A greater amount of ACC synthase causes an increase in the level
of ACC, eventually resulting in an increase in the ethylene concentration.
When ACC deaminase activity is present, it prevents the buildup of ACC,
even in the presence of high levels of IAA, so that the ethylene level does
not become elevated to the point where plant growth is impaired.


FIGURE 15.1 Cleavage of ACC to α-ketobutyrate and ammonia by ACC deaminase.


FIGURE 15.2 Schematic representation of the mechanisms by which an ACC
deaminase-containing plant growth-promoting bacterium bound to either a seed or a
plant root lowers the ethylene concentration and thereby prevents ethylene
inhibition of root elongation. The arrows indicate chemical or physical steps in
the mechanism, and the symbol ⊥ indicates inhibition of root elongation by
ethylene. IAA is synthesized and secreted by a plant growth-promoting bacterium
that is bound to the surface of either the seed or the root of a developing plant.
After being taken up by the plant, together with the IAA from the plant, the
bacterial IAA can stimulate either plant cell proliferation and elongation or the
activity of the enzyme ACC synthase, which converts S-adenosylmethionine (AdoMet)
to ACC. A significant portion of the ACC is exuded from plant roots or seeds,
along with other small molecules normally present in seed or root exudates; taken
up by the bacterium; and hydrolyzed by the enzyme ACC deaminase to ammonia and
α-ketobutyrate (α-KB). This uptake and cleavage of ACC decreases the amount of ACC
outside the plant. To maintain the equilibrium between internal and external ACC,
the plant exudes more ACC. Consequently, the concentration of ACC, and therefore
ethylene, in the plant is lowered. Adapted from Glick et al., J. Theor. Biol.
190:63-68, 1998.
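
The idea that a root-bound bacterium acts as a sink for ACC can be illustrated with a toy two-compartment model. In the Python sketch below, ACC inside the plant is synthesized at a constant rate, converted to ethylene, and exchanged passively with ACC outside the plant, where bacterial ACC deaminase removes it. The first-order kinetics, the passive exchange term, and every rate constant are hypothetical choices made purely for illustration; they are not taken from the published model or from experimental data.

```python
def steady_state_acc(synthesis, k_ethylene, k_exchange, k_deaminase):
    """Steady-state ACC level inside the plant (arbitrary units).

    synthesis    constant rate of ACC synthesis in the plant
    k_ethylene   first-order rate of ACC conversion to ethylene
    k_exchange   first-order exchange between internal and external ACC
    k_deaminase  first-order cleavage of external ACC by the bacterium
    """
    total = k_exchange + k_deaminase
    sink = k_exchange * k_deaminase / total if total > 0 else 0.0
    return synthesis / (k_ethylene + sink)


def simulate(synthesis, k_ethylene, k_exchange, k_deaminase, hours=200.0, dt=0.01):
    """Forward-Euler integration of the same model, starting from zero ACC."""
    acc_in = 0.0
    acc_out = 0.0
    for _ in range(int(hours / dt)):
        exchange = k_exchange * (acc_in - acc_out)   # net exudation from the plant
        d_in = synthesis - k_ethylene * acc_in - exchange
        d_out = exchange - k_deaminase * acc_out
        acc_in += d_in * dt
        acc_out += d_out * dt
    return acc_in, acc_out


if __name__ == "__main__":
    # Hypothetical parameter values, in arbitrary units.
    params = dict(synthesis=1.0, k_ethylene=0.5, k_exchange=1.0)
    without = steady_state_acc(k_deaminase=0.0, **params)
    with_bacterium = steady_state_acc(k_deaminase=2.0, **params)
    print(f"internal ACC without deaminase: {without:.2f}")
    print(f"internal ACC with deaminase:    {with_bacterium:.2f}")
```

In this sketch the rate of ethylene synthesis is proportional to internal ACC, so any deaminase activity outside the plant lowers both. How large the effect is depends entirely on the assumed constants.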

In general, nitrogen fixation by free-living plant growth-promoting 
bacteria probably makes only a minor contribution to the growth of a plant. 
In fact, not all plant growth-promoting bacteria are diazotrophic, and many 
of those that are diazotrophic fix only limited amounts of nitrogen. 

A number of plants use bacterial iron-siderophore complexes to obtain 
iron from the soil. Without this mechanism, plant growth in many soils 
would be severely limited, as iron is an essential plant nutrient. However, 
while bacterial siderophores undoubtedly contribute to the nutrition, and 
hence to the growth, of plants, in many instances this effect is small. 

There is some controversy regarding the mechanism that plant
growth-promoting bacteria use to facilitate the uptake of minerals such as
phosphorus by a plant. On one hand, the increased mineral uptake in plants
treated with plant growth-promoting bacteria may reflect a better-developed
root system and an overall healthier plant. On the other hand, experiments
with Azospirillum have shown that this organism enhances mineral uptake by
secreting organic acids that can solubilize and bind some minerals.


FIGURE 15.3 Effect of treating canola seeds with ACC deaminase-containing plant
growth-promoting bacteria on root ACC content (A) and root length (B), following
growth of the plant for 4.5 days after the seeds were sown. The seeds were treated
with MgSO4 as a control, the ACC deaminase-containing bacterium P. putida
GR12-2, or the chemical ethylene inhibitor aminoethoxyvinylglycine (AVG). The
error bars indicate standard errors.


MILESTONE


A Model for the Lowering of Plant Ethylene
Concentrations by Plant Growth-Promoting Bacteria

B. R. Glick, D. M. Penrose, and J. Li
J. Theor. Biol. 190:63-68, 1998


In addition to the profound influence that ethylene has on normal plant
growth and development, it is a stress hormone whose synthesis is increased
when a plant is subjected to any one of a variety of environmental
stresses. These stresses include mechanical trauma, pathogen infection,
extremes of temperature, drought, flooding, salt, and the presence of
environmental contaminants. Following periods of stress, the ethylene that
is produced by the plant often exacerbates the effects of the stress. This
can lead to plant senescence or death. Any chemical or biological treatment
that lowers the amount of ethylene that is produced by a plant as a
consequence of an environmental stress should therefore also decrease some
of the resulting damage to the plant.

It has been known for many years that certain strains of bacteria can
promote the growth of plants. At the time that this article was published,
many of the mechanisms involved in the promotion of plant growth by
bacteria had apparently been elucidated. However, there did not seem to be
any one mechanism that could reliably and reproducibly promote the growth
of a wide variety of plants under a range of different conditions. This was
in spite of the fact that some strains of Rhizobium, some biocontrol
strains, and some Azospirillum strains had been commercialized, albeit to a
limited extent.

In this article, Glick and coworkers developed a conceptual framework to
explain a number of empirical observations that their laboratory had
reported beginning in 1994. In its simplest terms, the model that they
elaborated suggested that some plant growth-promoting bacteria that were
bound to plant tissues could act as a sink for some of the ACC that was
produced by the plant in response to various types of stress. Since ACC is
the immediate precursor of ethylene in all higher plants, the model
predicted that lowering ACC levels before it could be converted to ethylene
would limit some of the deleterious effects of a particular stressor on
a plant. The model was tested in a
growth chamber, then in a green¬ 
house, and eventually in field experi¬ 
ments. It was found that plant 
growth-promoting bacteria that con¬ 
tained active ACC deaminase could 
significantly decrease the inhibition of 
growth and damage to plants fol¬ 
lowing exposure to either high salt 
levels, the presence of metals or 
organic contaminants, phytopatho¬ 
gens, flooding, or drought. Moreover, 
workers in many different laboratories 
around the world have found that this 
approach works well with a wide 
range of plants, including canola, 
tomato, lettuce, soybean, mung bean, 
Indian mustard, various grasses, 
wheat, pea, corn, and cotton. Thus, by 
either selecting or engineering plant 
growth-promoting bacteria to express 
ACC deaminase, the productivity of a 
range of crop plants can be improved 
dramatically. 


As a better understanding of the mechanisms used by plant growth- 
promoting bacteria emerges, it will become possible to genetically engineer 
improved organisms that can stimulate the growth of a wide range of 
plants in a variety of environments. 

Decreasing Plant Stress 

In addition to its effect on seed germination and root elongation, ethylene 
mediates a wide range of plant responses and developmental steps. 
Ethylene is involved in tissue differentiation, formation of root and shoot 
primordia, lateral bud development, flowering initiation, anthocyanin
synthesis, flower opening and senescence, fruit ripening and degreening,
production of volatile organic compounds that are responsible for aroma
formation in fruits, storage product hydrolysis, leaf and fruit abscission, 
and the response of plants to biotic and abiotic stress. In some processes, 
ethylene is stimulatory, while in others it is inhibitory. 

The term "stress ethylene" describes the increase in ethylene
biosynthesis associated with biological and environmental stresses and pathogen
attack. The increased level of ethylene formed in response to trauma
inflicted by chemicals, temperature extremes, water stress, ultraviolet light,
insect damage, disease, and mechanical wounding can be both the cause of
some of the symptoms of stress (e.g., onset of wilting and increased
senescence) and the inducer of responses that will enhance the survival of the
plant under adverse conditions. Often, a small burst of ethylene is
synthesized by plants within a few hours after an environmental stress. This low
level of ethylene acts as a trigger to initiate the biosynthesis of a number of
plant defense proteins. Subsequently, some 2 to 4 days after the onset of the
stress, the plant produces a much larger burst of ethylene. It is this second
peak of ethylene synthesis that is responsible for exacerbating the
deleterious effect(s) of the stress (Fig. 15.4).

While chemicals have been successfully used to control ethylene levels 
in plants, many of them are either expensive or potentially harmful to the 
environment. Consequently, ACC deaminase-containing plant
growth-promoting bacteria have been tested to determine whether they could be used
as an environmentally safe method for lowering plant ethylene levels. 

Flooding is a common abiotic stress that affects many plants, often 
several times during the same growing season. Plant roots suffer a lack of 
oxygen as a consequence of flooding; this, in turn, causes deleterious 
effects, such as wilting (epinasty), inhibition of leaf chlorophyll synthesis 
(chlorosis), cell death (necrosis), and reduced fruit yield. Many plants 
respond to flooding by activating the transcription, in root cells, of some of 
the genes that code for isozymes of ACC synthase, the enzyme that
converts the compound S-adenosylmethionine into ACC. This eventually
results in an increase in the amount of ACC inside plant roots. However, 
since ACC oxidase cannot catalyze ethylene synthesis in the absence of 
oxygen, ACC is transported from the anaerobic environment of flooded 
roots into the aerobic shoots, where it is converted to ethylene (Fig. 15.5). 
The ethylene in the shoots causes plants to wilt, to lose biomass, and
eventually (if the ethylene remains elevated for a prolonged time) to senesce
and die. Treatment of tomato plants with ACC deaminase-containing plant 
growth-promoting bacteria significantly decreases the damage suffered by 
these plants due to stress ethylene brought on as a consequence of flooding 
(Fig. 15.6). These ACC deaminase-containing plant growth-promoting
bacteria can act as a sink for ACC, lowering the level of ethylene that can be
formed in the shoots and thereby protecting the tomato plants from a
portion of the damage caused by flooding.

In addition to protecting plants from flooding damage, ACC
deaminase-containing plant growth-promoting bacteria can also significantly
decrease the damage to plants that is caused by drought, temperature
extremes, high concentrations of salt, and a variety of environmental
contaminants. For example, in greenhouse experiments, tomato plants treated
with Achromobacter piechaudii ARV8, which contains ACC deaminase, were
able to grow better in the presence of 86 mM salt (which is usually
inhibitory to plant growth) than were tomato plants grown without the added
bacterium in either the presence or the absence of salt (Fig. 15.7). This
bacterium, which was isolated from a soil sample from the Arava region of the
Negev desert in Israel, significantly lowered the level of stress ethylene 
produced by tomato plants in the presence of salt. More recently, several 
groups have shown that this approach can facilitate the growth of a range 
of crops in saline soils in the field, a problem that is endemic to about 25% 
of the world's arable land. 



FIGURE 15.4 Time course of plant ethylene synthesis following environmental
stress or fungal pathogen infection.


FIGURE 15.5 Schematic representation of a flooded potted plant, where the ACC
that is produced in the roots as a consequence of the stress is unable to be
converted to ethylene because of the absence of oxygen. The ACC is
subsequently transported to the shoots, where oxygen is plentiful, and
converted to ethylene, causing epinasty
and loss of biomass.


FIGURE 15.6 Effect of ACC deaminase-containing plant growth-promoting
bacteria on flooded tomato plants. (A and B) Fifty-five-day-old tomato plants
were either grown for an additional 9 days (A) or flooded for 9 days before
the dry weight of the leaves and shoots was determined (B). (C) The amount of
ethylene produced by 55-day-old tomato plant leaf stems (petioles) following
9 days of flooding was also measured. ACC deaminase activity is present in
Enterobacter cloacae CAL2 and P. putida UW4, and in another P. putida strain
transformed with the plasmid pRK415ACC, which carries a bacterial ACC
deaminase gene. P. putida strain pRK415 has no ACC deaminase activity. The
error bars indicate standard errors; the standard errors in panel C were
negligible. Nonflooded plants produce approximately 0.07 pmol of ethylene
g⁻¹ s⁻¹, regardless of the presence or absence of bacteria. Adapted from
Grichko and
Glick, Plant Physiol. Biochem. 39:11-17, 2001.


FIGURE 15.7 (A) The ACC deaminase-containing bacterium A. piechaudii ARV8
(green) increases the tolerance of tomato plants for salt compared to plants
grown without the bacterium (red). (B) Treatment of tomato plants with salt
causes an increase in the synthesis of stress ethylene. Ethylene production in
the presence of salt is partially inhibited by A. piechaudii ARV8. The error
bars indicate standard errors.


Increasing Phosphorus Availability

While a number of plant growth-promoting bacteria can synthesize and
secrete organic acids that can dissolve inorganic phosphate in the
environment, these organisms can rarely break down phytate, the complex
compound (inositol hexaphosphate) that is the major chemical form of
phosphorus within cereal grains and oilseeds. Several plants can produce
phytases, enzymes that degrade phytate. However, the activity of these
enzymes in plant roots is generally low, so these plants cannot efficiently
utilize the phytate that is found in the soil. A gene encoding the enzyme
phytase was isolated from the fungus Aspergillus fumigatus. This gene was
inserted, using a transposon, into the chromosomal DNA of a strain of the
bacterium Bacillus mucilaginosus, which can dissolve phosphorus from
calcium phosphate. The transformed bacterial strain can express and secrete
active phytase. In greenhouse experiments, this transformed bacterium
was found to be superior to the wild-type strain in providing phosphorus
to tobacco plants cultivated in its presence (Table 15.2). Importantly, the
stimulation of plant growth that was observed in greenhouse experiments
was also evident in the field, where the yield of tobacco plants increased by
19%. Notwithstanding these results, it may be some time before this work
is commercialized because of political concerns about the use of genetically
engineered bacteria in the environment.


TABLE 15.2 Growth of tobacco plants for 90 days in pots in the greenhouse with
different treatments

| Characteristic | Soil with no added bacteria | Soil plus wild-type bacterium | Soil plus transformed bacterium |
|---|---|---|---|
| Plant height (cm) | 17.4 | 21.4 | 24.7 |
| Plant dry weight (mg) | 1,297 | 1,685 | 1,870 |
| Leaf P content (µg/g) | 710 | 732 | 800 |
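
The relative size of the effects reported in Table 15.2 can be checked with a few lines of arithmetic. The following Python sketch is an illustration only and is not part of the original study; the names TABLE_15_2, percent_gain, and summarize are hypothetical, and the values are transcribed from the table.

```python
# Values transcribed from Table 15.2 (tobacco, 90 days, greenhouse pots).
# Each tuple is (no added bacteria, wild-type bacterium, transformed bacterium).
TABLE_15_2 = {
    "Plant height (cm)": (17.4, 21.4, 24.7),
    "Plant dry weight (mg)": (1297, 1685, 1870),
    "Leaf P content": (710, 732, 800),
}


def percent_gain(control, treated):
    """Percent change of a treated value relative to the untreated control."""
    return 100.0 * (treated - control) / control


def summarize(table):
    """Return {characteristic: (wild-type gain %, transformed gain %)}."""
    return {
        name: (percent_gain(control, wild_type), percent_gain(control, transformed))
        for name, (control, wild_type, transformed) in table.items()
    }


if __name__ == "__main__":
    for name, (wt_gain, tr_gain) in summarize(TABLE_15_2).items():
        print(f"{name}: wild type {wt_gain:+.1f}%, transformed {tr_gain:+.1f}%")
```

With these numbers, the transformed strain raises plant height and dry weight by roughly 42% and 44% over untreated soil, while leaf phosphorus content rises by about 13%.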


FIGURE 15.8 A six-coordinate iron- 
siderophore complex. Three bidentate 
functional groups on a siderophore 
molecule bind with ferric iron. 



Biocontrol of Pathogens 

Phytopathogens are an ongoing and serious agricultural problem that can 
reduce crop yields by 25 to 100%. This is an enormous loss of productivity. 
Currently, phytopathogen damage to crops is generally dealt with by the 
use of chemical agents, although other treatments have also been employed. 
For most bacterial diseases, plants may be symptomless for prolonged 
periods before changes in environmental conditions which favor the
proliferation of the bacteria cause a rapid outbreak of disease. Under these
conditions, severe damage can occur and destroy an entire crop. These field
epidemics are difficult and costly to control. 

Many of the chemicals that are used to control phytopathogens are 
hazardous to animals and humans, and they persist and accumulate in 
natural ecosystems. It is therefore desirable to replace these chemical 
agents with biological control agents that are more "friendly" to the
environment. One approach for the control of phytopathogens is the
development of transgenic plants that are resistant to one or more of them (see
chapter 18). Alternatively, some plant growth-promoting bacteria can act as 
biocontrol agents to suppress or prevent phytopathogen damage, and a 
number of these biocontrol bacteria have been commercialized (Table 15.3). 
Plant growth-promoting bacteria can produce a variety of substances that 
limit damage to plants by phytopathogens. They include siderophores, 
antibiotics, other small molecules, and a variety of enzymes. This approach 
is still at an early stage of development but appears to have considerable 
potential. However, the ultimate utility of a strategy based on a particular 
mechanism can be assessed only under field conditions. 

TABLE 15.3 Some commercially available biocontrol plant growth-promoting bacteria

| Bacterium | Pathogen or disease | Crop(s) |
|---|---|---|
| Agrobacterium radiobacter | Crown gall disease caused by A. tumefaciens | Fruit trees, nut trees, and ornamental nursery stock |
| Azotobacter brasilense | Root rot and damping-off | Turf, forage crops, corn |
| Bacillus subtilis | Rhizoctonia solani, Pythium spp., Fusarium spp., Alternaria spp., and Aspergillus spp. that attack roots; also various seedling pathogens | Cotton, legumes, barley, tomato, rice |
| Bacillus amyloliquefaciens | Fusarium spp., Rhizoctonia spp. | Herbs, spices, vegetables, tree seedlings, ornamentals |
| Bacillus pumilis | Powdery mildew, downy mildew, Fusarium spp., Phytophthora spp., Rhizoctonia spp., Sclerotinia spp. | Fruits and vegetables, oak, maple, stored seeds |
| Burkholderia cepacia | Fusarium spp., Pythium spp., nematodes | Vegetables |
| Burkholderia cepacia | Rhizoctonia spp., Fusarium spp., Pythium spp., nematodes | Alfalfa, barley, beans, clover, corn, cotton, peas, grain sorghum, vegetables, wheat |
| Paenibacillus polymyxa | Damping-off, powdery mildew | Cucumber |
| Pseudomonas chlororaphis | Fusarium, leaf stripe, leaf spot, net blotch, spot blotch | Barley, oat |
| Pseudomonas fluorescens | Frost, Erwinia amylovora, Pseudomonas tolassii | Almond, apple, cherry, mushrooms, peach, pear, potato, strawberry, tomato |
| Pseudomonas syringae | Botrytis cinerea, Penicillium spp., Mucor piriformis, Geotrichum candidum | Citrus and pome fruit, potatoes |
| Streptomyces griseoviridis | Fusarium spp., Alternaria brassicola, Phomopsis spp., Botrytis spp., Pythium spp., and Phytophthora spp. | Field, ornamental, and vegetable crops |
| Streptomyces lydicus | Control of root rot and damping-off caused by Fusarium, Rhizoctonia, Pythium, Phytophthora, Sclerotinia, Postia, and Verticillium; also suppresses foliar diseases caused by Botrytis | Useful for protection of cuttings of a variety of plants; also used with turfgrass |
| Mixture of B. subtilis, P. polymyxa, Bacillus circulans, and B. amyloliquefaciens | A range of fungal damping-off diseases | Especially useful in hydroponic gardens |


Siderophores

Iron is one of the most abundant minerals on Earth and is an essential
requirement for living organisms. However, iron in the soil is unavailable
for direct assimilation by microorganisms because ferric iron, or Fe(III),
which is the predominant form in nature, is only sparingly soluble, i.e., its
solubility is about 10⁻¹⁸ M at pH 7.4. This amount of soluble iron is much
too small to support microbial growth. Consequently, to survive in this
environment, soil microorganisms synthesize and secrete low-molecular-
mass (~400- to 1,000-dalton) iron-binding molecules known as
siderophores (Fig. 15.8). Siderophores bind Fe(III) with a very high affinity
(dissociation constant [Kd] = 10⁻²⁰ to 10⁻⁵⁰ M) and transport it back to cell
surface receptors, where it is taken into the cell. Once inside a cell, the iron
is released and is then available to support microbial growth.
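
To get a feel for why this concentration cannot support growth, the molar values can be converted into numbers of ions. The sketch below is a back-of-the-envelope illustration, assuming the solubility quoted above is read as 10⁻¹⁸ M and taking 10⁻²⁰ M, the weakest value in the quoted range, as the dissociation constant. The 1 µM free siderophore concentration and the simple 1:1 binding model are assumptions made only for this example, as are the function names.

```python
AVOGADRO = 6.02214076e23  # particles per mole (exact SI value)


def ions_per_litre(molar):
    """Number of dissolved ions in one litre at the given molar concentration."""
    return molar * AVOGADRO


def ions_per_cubic_micrometre(molar):
    """Same quantity per cubic micrometre (1 litre = 1e15 cubic micrometres)."""
    return ions_per_litre(molar) / 1e15


def bound_to_free_ratio(free_siderophore_molar, kd_molar):
    """For simple 1:1 binding at equilibrium, [Fe-siderophore]/[Fe] = [S]/Kd."""
    return free_siderophore_molar / kd_molar


if __name__ == "__main__":
    fe_solubility = 1e-18      # M, free Fe(III) at pH 7.4 as quoted in the text
    weakest_kd = 1e-20         # M, weakest affinity in the quoted range
    assumed_siderophore = 1e-6 # M, hypothetical free siderophore concentration

    print(f"Free Fe(III) ions per litre: {ions_per_litre(fe_solubility):.3g}")
    print(f"Free Fe(III) ions per cubic micrometre: "
          f"{ions_per_cubic_micrometre(fe_solubility):.3g}")
    print(f"Bound:free iron ratio: "
          f"{bound_to_free_ratio(assumed_siderophore, weakest_kd):.3g}")
```

At 10⁻¹⁸ M there are only about 6 × 10⁵ free Fe(III) ions in an entire litre, far fewer than one ion per cubic micrometre of soil water, whereas even the weakest siderophore in the quoted range would hold about 10¹⁴ bound ions for every free one under the assumed conditions.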

Plant growth-promoting bacteria can prevent the proliferation of 
fungal phytopathogens by producing siderophores that bind most of the 
Fe(III) in the area around the plant root (the rhizosphere). The resulting 
lack of iron prevents fungal pathogens from proliferating in the immediate 
vicinity. Fungal phytopathogens also synthesize siderophores, but these 
generally have a much lower affinity for iron than do the siderophores 
produced by plant growth-promoting bacteria. In effect, the plant growth- 
promoting bacteria outcompete fungal phytopathogens for available 
iron. 

Unlike microbial phytopathogens, plants are not generally harmed by 
the localized depletion of iron in the soil caused by plant growth-promoting 
bacteria. Most plants can grow at much lower iron concentrations than 
microorganisms. In addition, some studies have shown that iron that has 
been sequestered by bacterial siderophores is taken up from the soil by the 
plant, to its benefit.


FIGURE 15.9 Iron-binding groups of microbial siderophores.


Because the sequestering of iron by a siderophore-producing
bacterium can simultaneously prevent the proliferation of a number of different
phytopathogenic microorganisms, siderophore genes are being examined 
to determine whether they can be used to create more effective biocontrol 
inoculants. 

Siderophores generally have three functional, or iron-binding, groups 
connected by a flexible backbone, often a peptide. Each functional group 
usually presents two atoms of oxygen or, less commonly, two nitrogen 
atoms, that bind iron. In chemical terms, the functional groups are
bidentate. Trivalent ferric iron can accommodate three of these groups to form a
six-coordinate complex (Fig. 15.8). With some exceptions, the functional 
groups on microbial siderophores are either hydroxamates or catecholates 
(Fig. 15.9). Different combinations of functional groups may be present on 
a single siderophore. Other functional groups include carboxylate moieties, 
such as citrate, and ethylenediamine (Fig. 15.9). In general, hydroxamate- 
type siderophores are typical of fungi, and catecholates, which bind iron 
more tightly than hydroxamates, are common in bacterial siderophores. 
Plant siderophores, on the other hand, are linear hydroxy- and amino- 
substituted iminocarboxylic acids, such as mugineic acid and avenic acid. 

One bacterial siderophore, called pseudobactin (Fig. 15.10), has been
estimated to bind to Fe(III) with an affinity constant of approximately 10²⁵
liters mol⁻¹. All fluorescent pseudomonads, so named because they
produce a siderophore that fluoresces when excited by ultraviolet light,
synthesize structurally related siderophores that differ mainly in the number
and configuration of the amino acids in the peptide chain that makes up
the backbone.


FIGURE 15.10 Structure of the siderophore called pseudobactin from Pseudomonas sp.
strain B10. One molecule of Fe(III) is bound to the siderophore.

The synthesis and regulation of pseudobactin in the plant growth- 
promoting bacterium P. putida WCS358 has been examined in detail. 
Mutagenesis was used to generate a set of 28 mutants that were defective 
for siderophore production. Two criteria were used for identifying the 
siderophore-deficient mutants: (1) lack of fluorescence under ultraviolet 
light and (2) inability to grow in the presence of bipyridyl, a molecule that 
sequesters most of the iron in the growth medium. When most of the iron 
is unavailable, only a cell that produces siderophores can grow. 

A clone bank of P. putida WCS358 DNA was constructed in the broad- 
host-range cosmid vector pLAFR1 and was introduced by conjugation into
each of the 28 siderophore mutants (Fig. 15.11). All of the resultant
transformants were tested by complementation for restoration of fluorescence
and/or the ability to grow in the presence of bipyridyl. Thirteen separate 
complementing cosmid clones, with an average insert size of 26 kilobase 
pairs (kb), were identified. After detailed analyses, these clones were found 
to represent at least five separate gene clusters. 
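
The screening and complementation logic described above can be written out explicitly. The following Python sketch is schematic: the Colony record and both function names are hypothetical, and it assumes that a mutant was scored as siderophore deficient only when it met both criteria, which the text implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colony:
    """Phenotype of one colony as scored on the two test plates (hypothetical)."""
    name: str
    fluoresces_under_uv: bool
    grows_with_bipyridyl: bool


def is_siderophore_deficient(colony):
    """Assumed rule: no fluorescence AND no growth when bipyridyl ties up iron."""
    return not colony.fluoresces_under_uv and not colony.grows_with_bipyridyl


def is_complemented(recipient):
    """A cosmid complements a mutant if fluorescence and/or growth is restored."""
    return recipient.fluoresces_under_uv or recipient.grows_with_bipyridyl


if __name__ == "__main__":
    wild_type = Colony("wild type", True, True)
    mutant = Colony("mutant 1", False, False)
    mutant_with_cosmid = Colony("mutant 1 + cosmid", True, True)

    for colony in (wild_type, mutant, mutant_with_cosmid):
        print(colony.name,
              "deficient" if is_siderophore_deficient(colony) else "proficient")
    print("complemented:", is_complemented(mutant_with_cosmid))
```

The "and/or" in the complementation test is kept deliberately loose here, mirroring the wording of the text, so a cosmid that restores only one of the two phenotypes still counts as complementing.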

One of these gene clusters has been studied further. It has a length of
33.5 kb and contains five transcriptional units with at least seven separate
genes. Thus, like nitrogen fixation and nodulation, siderophore biosynthesis
is a complex process. Since each siderophore is encoded by a number of
different genes, genetically engineering bacteria to produce modified
siderophores is not a simple matter. However, there may be other ways to
improve the effectiveness of plant growth-promoting bacteria as biocontrol
agents. For example, it may be possible to extend the range of
iron-siderophore complexes that one bacterial strain can utilize so that a
genetically altered plant growth-promoting biocontrol bacterial strain could
take up and use siderophores synthesized by other soil microorganisms, thereby
giving it a competitive advantage. This was done by cloning the genes for
iron-siderophore receptors from one plant growth-promoting biocontrol
bacterium and introducing them into other strains.


FIGURE 15.11 Cloning genes involved in siderophore biosynthesis. The clone bank is
constructed using the broad-host-range cosmid pLAFR1. The cells that have
mutations in one of the genes involved in siderophore biosynthesis are unable to
grow on medium containing bipyridyl, which sequesters all of the free iron in the
medium. Cells with mutations in genes involved in siderophore biosynthesis are
selected from the replica plate that does not contain bipyridyl. Transformants that
can grow in the presence of bipyridyl are able to complement the mutation in one
of the siderophore biosynthesis genes.


TABLE 15.4 Effect of additional copies of the rpoD gene that encodes σ70 from
P. fluorescens CHA0 on the ability of the bacterium to prevent damage to cucumber
roots caused by the pathogenic fungus P. ultimum

| Plant growth-promoting bacterium added | Average root fresh weight (mg) without P. ultimum | Average root fresh weight (mg) with P. ultimum |
|---|---|---|
| None | 382 | 44 |
| P. fluorescens CHA0 | 386 | 177 |
| P. fluorescens CHA0 with vector | 365 | 146 |
| P. fluorescens CHA0 with vector and rpoD gene | 371 | 335 |

Adapted from Schnider et al., J. Bacteriol. 177:5387-5392, 1995.

In the absence of the plant growth-promoting bacterium P. fluorescens CHA0, the
pathogen P. ultimum dramatically inhibits root growth. The presence of the rpoD
gene on the plasmid vector enhanced the activity of the plant growth-promoting
bacterium. Plants were grown for 2 weeks before their roots were measured.
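
One way to read Table 15.4 is to express root weight in the presence of the pathogen as a percentage of the weight without it. The short Python sketch below does this with the values from the table; the dictionary and function names are hypothetical, and the calculation is an illustration rather than an analysis performed in the cited study.

```python
# Average root fresh weight (mg) from Table 15.4:
# (without P. ultimum, with P. ultimum) for each bacterial treatment.
TABLE_15_4 = {
    "None": (382, 44),
    "CHA0": (386, 177),
    "CHA0 with vector": (365, 146),
    "CHA0 with vector and rpoD": (371, 335),
}


def percent_retained(without_pathogen, with_pathogen):
    """Root weight with the pathogen as a percentage of weight without it."""
    return 100.0 * with_pathogen / without_pathogen


def protection_summary(table):
    """Return {treatment: percent of root weight retained under infection}."""
    return {
        treatment: percent_retained(without, with_)
        for treatment, (without, with_) in table.items()
    }


if __name__ == "__main__":
    for treatment, retained in protection_summary(TABLE_15_4).items():
        print(f"{treatment}: {retained:.1f}% of root weight retained")
```

By this measure, roots retain about 12% of their weight with no bacterium added, about 46% with the unmodified strain, and about 90% with the strain carrying extra copies of rpoD.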

Antibiotics 

One of the most effective mechanisms by which a plant growth-promoting 
bacterium can prevent phytopathogen proliferation is the synthesis of
antibiotics. For example, the antibiotics synthesized by biocontrol pseudomonads
include agrocin 84, agrocin 434, 2,4-diacetylphloroglucinol, herbicolin, 
oomycin, phenazines, pyoluteorin, and pyrrolnitrin. 

The biocontrol activity of a plant growth-promoting bacterium may be 
improved by providing it with genes that encode the biosynthesis of
antibiotics that are normally produced by other bacteria. In this way, the range
of phytopathogens that a single biocontrol bacterium can suppress can be 
extended. Moreover, by limiting the growth of other soil microorganisms, 
antibiotic-secreting plant growth-promoting bacteria should facilitate their 
own proliferation, since they will have fewer competitors for limited
nutritional resources. In addition, genetic manipulation can be used to increase
the amount of antibiotic that a bacterium synthesizes. 

The production of a number of antifungal metabolites by pseudomonads
appears to be controlled by a protein that acts as a global transcriptional
regulator; therefore, it should be possible to enhance antibiotic production
by modifying this global regulation. For example, antibiotic production was
enhanced after Pseudomonas fluorescens CHA0 was transformed with a vector
carrying the gene (rpoD) encoding the housekeeping RNA polymerase sigma-70
(σ70). The modified strain was more effective at protecting cucumber plants
against a root disease caused by the fungus Pythium ultimum (Table 15.4).

A single copy of the operon carrying all seven of the genes that encode
the biosynthesis of the antibiotic phenazine-1-carboxylic acid (i.e.,
phzABCDEFG) was inserted into the chromosomal DNA of a plant
growth-promoting bacterial strain of P. fluorescens (Fig. 15.12). The wild-type
version of this bacterium, which does not synthesize phenazine-1-carboxylic
acid, acts as a biocontrol agent against some fungal diseases. As indicated by
a much larger zone of clearance of the fungal pathogen P. ultimum on solid
medium, the engineered bacterium has a higher level of biocontrol activity
than the wild type (Fig. 15.13). Also, the phenazine-1-carboxylic
acid-producing bacterium prevented P. ultimum-caused damping-off disease in pea
plants in soil. This work demonstrates the efficacy of this approach under
greenhouse conditions; however, it remains to be demonstrated whether
this altered bacterium is effective in the field.


FIGURE 15.12 Chromosomal insertion of the antibiotic phenazine-1-carboxylic acid
operon (phz) into a biocontrol strain of P. fluorescens. The regulatory genes that
normally control the expression of the seven biosynthetic genes were removed, and
the entire operon was placed under the control of the tac promoter (ptac). Since
P. fluorescens does not utilize lactose as a carbon source, it does not encode the
lac repressor, and in the absence of the lac repressor, any genes under the control
of the tac promoter are expressed constitutively. The operon, under the control of
the tac promoter, was inserted into a derivative of transposon Tn5 adjacent to a
kanamycin resistance gene (not shown) on a plasmid. Tn5 facilitates integration of
DNA into the chromosome of the host cell. Transconjugants in which the chromosomal
insertion had not inactivated any important bacterial functions were tested for
their effectiveness as biocontrol strains. The Tn5 derivative is designed so that
it does not easily pass from the biocontrol strain to other bacteria in the
environment.


FIGURE 15.13 Effect of transforming a strain of P. fluorescens with genes
expressing the antibiotic phenazine-1-carboxylic acid biosynthetic pathway on the
ability of the bacterium to prevent the growth of the fungal pathogen P. ultimum on
solid medium. The antifungal activity of the bacterium is proportional to the area
of the zone of clearance around the center of the petri plate to which the
bacterium is added.


At present, there is still only one commercially available genetically 
engineered biocontrol bacterial strain. A modified version of Agrobacterium 
radiobacter K84 has been marketed, first in Australia in 1989, and more 
recently all over the world, as a means of controlling crown gall disease, 
which is caused by the bacterium Agrobacterium tumefaciens. This disease 
affects almond trees and stone fruit trees, such as peach trees. The antibiotic 
agrocin 84, which is produced by A. radiobacter, is toxic to A. tumefaciens. 
However, agrocin 84-resistant strains of A. tumefaciens can develop if the 
plasmid carrying the genes for the biosynthesis of agrocin 84 is accidentally 
transferred from A. radiobacter. To avoid this possibility, the region of DNA 
responsible for plasmid transfer was removed from the agrocin 84 plasmid, 
pAgK84 (Fig. 15.14). As a result of this deletion, the A. radiobacter strain 
retains the capacity to act as a biocontrol agent, but it can no longer transfer 
the plasmid to pathogenic agrobacteria. 

Enzymes 

Some plant growth-promoting bacteria produce enzymes, such as
chitinase, β-1,3-glucanase, protease, and lipase, that can degrade fungal cell
walls and cause the fungal cells to lyse (Fig. 15.15). In one study, the
incidence of plant disease caused by the phytopathogenic fungi Rhizoctonia
solani, Sclerotium rolfsii, and P. ultimum was reduced by using a
β-1,3-glucanase-producing strain of Burkholderia cepacia. In another study, the
antifungal activities of three different strains of the plant growth-promoting
bacterium Enterobacter agglomerans were attributed to a complex of four
separate polypeptides that act together to degrade the chitin in fungal cell
walls. When tested, these bacteria significantly decreased the damage to
cotton plants following infection with R. solani. Moreover, Tn5 mutants of
E. agglomerans that were deficient in chitinase activity were unable to
protect plants against damage caused by the fungal pathogen, indicating that
the chitinase was the active element.

Many of the bacterial enzymes that can lyse fungal cells, including
chitinases and β-glucanases, are each encoded by a single gene. It should
therefore be straightforward to isolate these genes and transfer them to plant
growth-promoting bacteria to construct strains that produce
fungus-degrading enzymes. In one series of experiments, a chitinase gene was
isolated from the bacterium Serratia marcescens and then transferred into
Trichoderma harzianum and Rhizobium meliloti cells. In both cases, the
transformed microorganisms produced chitinase and displayed increased
antifungal activity. When the S. marcescens chitinase gene was introduced into
a strain of P. fluorescens that directly promotes plant growth, the
transformant also stably expressed and secreted active chitinase and effectively
controlled the phytopathogen R. solani.

Ice Nucleation and Antifreeze Proteins 

One of the ways in which some pathogenic leaf bacteria, such as 
Pseudomonas syringae, damage plants is by synthesizing ice nucleation pro¬ 
teins. These proteins, which are produced at low temperatures, are present 
on the surface of the bacterium and act as sites that facilitate the formation 
of ice crystals at freezing temperatures. As the ice crystals grow, they can 
pierce the plant cells and cause irreparable damage. The bacteria benefit 
from this damage by gaining direct access to the nutrients from the lysed 
plant cells. In the absence of ice nucleation proteins on the leaf surface, a 




FIGURE 15.14 Construction of a transfer-deficient (Tra~) derivative of plasmid 
pAgK84 from A. radiobacter, which encodes both synthesis of and immunity to the 
antibiotic agrocin 84. Based on knowledge of the restriction enzyme map of 
pAgK84, a DNA fragment containing the transfer (Tra) region, together with some 
of the flanking DNA, is isolated (1) and spliced into an E. coli plasmid vector (2). By 
restriction enzyme digestion, approximately 80% of the Tra region and some of the 
flanking DNA (a total of about 6 kb) is deleted from the cloned DNA containing the 
Tra sequence (3). Homologous recombination of the E. coli plasmid containing the 
deleted Tra region with plasmid pAgK84, in which transposon Tn5, which carries a 
kanamycin resistance gene, has been inserted into the Tra region, is performed (4). 
This results in some derivatives of pAgK84 in which a portion of the Tra region of 
the plasmid has been deleted (5). The resultant Tra⁻ mutant of pAgK84 can no
longer be conjugationally transferred to other agrobacteria, although it is still able
to synthesize and provide immunity to agrocin 84. None of the DNA fragments is
shown to scale.

FIGURE 15.16 Strategy for the isolation of a Pseudomonas antifreeze protein gene.
PCR primers based on the partial amino acid sequence of the isolated protein
were used to amplify a portion of the antifreeze protein gene. Inverse PCR
primers were designed based on the DNA sequence of the amplified gene
fragment.

FIGURE 15.15 Schematic representation of a fungal cell wall being degraded by one
or more enzymes produced by a biocontrol plant growth-promoting bacterium. 
Following the breakdown of the fungal cell wall, the cell is readily lysed. 


brief overnight frost would not damage the plant because the water in a 
plant cell must usually be several degrees below the freezing point before 
ice crystals begin to form (i.e., it must be supercooled). One way to prevent 
freezing damage caused by P. syringae to susceptible crops, such as strawberries,
is by spraying the plants, prior to the frost, with a mutant form of
the ice-nucleating bacterium. Such a mutant, which may be constructed by
either recombinant DNA manipulation or conventional mutagenesis and
selection, lacks the ability to produce the ice nucleation protein, and therefore,
ice crystals are not formed on the leaf surface. If a sufficient number
of these "ice-minus" mutant bacteria are sprayed onto a susceptible plant,
the mutant will displace the wild-type (ice-plus) bacteria, thereby preventing
ice nucleation.

An important facet of the effectiveness of a biocontrol plant growth- 
promoting bacterium is its ability to persist and proliferate in the natural 
environment. In areas such as Canada, Scandinavia, Russia, and the 
northern United States, these organisms must survive long, cold winters 
and then grow at cool soil temperatures in the spring (~5 to 10°C). While 
microorganisms have a variety of adaptive strategies for thriving under 
adverse conditions, it may be possible to engineer organisms that are better 
able to deal with cold temperatures. Some soil bacteria that are also able to 
promote plant growth can both grow at 5°C and secrete antifreeze proteins 
into the surrounding medium when grown at low temperatures. A bacterial
antifreeze protein regulates the formation of ice crystals outside the
bacterium and thereby provides protection for the bacterium in the soil. In 
the presence of bacterial antifreeze protein, ice crystals still form, but their 
size is limited. In the absence of antifreeze proteins, ice crystals can grow to 
a large size and eventually puncture the bacterial cell wall and membrane, 
causing cell lysis. Ice crystals do not form inside the bacterium to any great 
extent. This is because at low temperatures bacteria decrease their volume 
by pumping some of their water from inside to outside the cell. 

Recently, the gene for a bacterial antifreeze protein was isolated and 
characterized (Fig. 15.16). The strategy that was used to isolate this gene 
included purifying the protein, digesting it into small peptides with the 
proteolytic enzyme trypsin, determining the amino acid sequences of several
of these peptides, and using those amino acid sequences to design
polymerase chain reaction (PCR) primers for a portion of the antifreeze

Cloning of Rhizobium meliloti Nodulation Genes by
Direct Complementation of Nod⁻ Mutants

S. R. Long, W. J. Buikema, and F. M. Ausubel
Nature 298:485-488, 1982 


The isolation of a gene, when its basic features are unknown and there is no
heterologous probe or antibody available, can be a daunting task. In these
instances, it is often necessary to devise an innovative selection scheme. The
selection scheme may be based on immunological detection of the target protein,
determination of the activity of the target protein, DNA hybridization with an
oligonucleotide probe whose sequence is deduced from the partial amino acid
sequence of the purified target protein, or mutant complementation. Often, after
a gene that encodes a particular function has been isolated, similar genes from
other organisms can be isolated by using the first gene as a heterologous DNA
hybridization probe. This approach depends on nucleotide sequence similarity
between the probe and the target gene. This strategy generally works well for
genes that are evolutionarily conserved, such as those that encode proteins
involved in the process of nitrogen fixation, but it works poorly for many other
genes, including those that encode proteins that degrade cellulose.

When Long and her coworkers set out to isolate and characterize the nodulation
genes from R. meliloti, nodulation genes had never been isolated. Very little was
known about how many genes were involved in this process or what products they
might encode. As a starting point, these workers isolated and characterized
several nodulation-defective mutants of R. meliloti. However, these studies had
not given them any clue to the functions of these genes. Therefore, they set
about selecting nodulation genes that could complement their nodulation-defective
R. meliloti mutants. This work was further complicated by the fact that, at that
time, clone banks were almost always constructed and maintained in E. coli.
Therefore, to facilitate the assembly of the R. meliloti clone bank and its
subsequent transfer to nodulation-defective mutant strains of R. meliloti, the
researchers first constructed a broad-host-range cosmid vector that could carry
large DNA fragments, with an average size of approximately 23 kb, and be stably
maintained in a number of different gram-negative bacteria. The large size of
the DNA on the cosmid increased the likelihood that other genes involved in
nodulation would be on the same DNA fragment as the complementing DNA sequence.
After the cosmids had been transferred by conjugation to R. meliloti, the
transformed bacteria were tested for their ability to nodulate alfalfa plants.
In previous experiments, it had been found that even one nodulation-proficient
bacterium in the presence of 10⁴ defective ones could successfully nodulate
alfalfa plants. Bacteria with cosmids carrying a complementing gene were isolated
directly from the nodules.
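
The insert size of a cosmid also determines how many clones a bank must contain before a given gene is likely to be represented. The following is a minimal sketch using the standard Clarke-Carbon estimate, N = ln(1 - P)/ln(1 - f), where f is the ratio of insert size to genome size. The 23-kb insert size comes from the text; the genome size and the 99% target probability are assumed round figures chosen only for illustration, and the function name is hypothetical.

```python
import math


def clones_needed(insert_kb: float, genome_kb: float, probability: float) -> int:
    """Clarke-Carbon estimate of clone bank size.

    N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome
    carried by a single insert and P is the desired probability that any
    given sequence is present in the bank.
    """
    fraction = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


# 23 kb is the average insert size quoted in the text. The genome size and
# the target probability are assumptions chosen only for illustration.
ASSUMED_GENOME_KB = 6700.0
n_clones = clones_needed(insert_kb=23.0, genome_kb=ASSUMED_GENOME_KB, probability=0.99)
print(f"Clones needed for 99% coverage: {n_clones}")
```

Because the estimate depends on the ratio of insert size to genome size, doubling the insert size roughly halves the number of clones that must be screened, which is one practical argument for a large-insert cosmid vector.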

In summary, these researchers were the first to isolate nodulation genes. They
devised a clever and effective selection scheme. The broad-host-range cosmid
vector that they developed has since been used many times by a large number of
other researchers.


protein gene. Following the sequence determination of the PCR-amplified 
DNA, inverse PCR (Fig. 15.16) was used to obtain the remaining portion of 
the gene. It should be possible to transfer this gene to various strains of 
plant growth-promoting bacteria to create strains that can persist and proliferate
at cold temperatures. Although there is currently no evidence to
link antifreeze activity with the mechanism that bacteria use to function at 
low temperatures, it will be interesting to examine experimentally whether 
antifreeze protein activity is part of the adaptive strategy used by some 
bacteria for cold, as well as freezing, tolerance. 
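
Designing PCR primers from a peptide sequence requires back-translating amino acids into codons, and because most amino acids have several codons, the primer becomes a degenerate mixture. The sketch below counts that degeneracy using the standard genetic code and picks the least degenerate stretch of a peptide. The peptide shown is hypothetical and is not a sequence from the antifreeze protein; the function names are likewise illustrative.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(codon)


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that could encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)


def best_window(peptide: str, length: int) -> str:
    """Return the stretch of the given length with the lowest degeneracy."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(windows, key=degeneracy)


# Hypothetical tryptic peptide, used only to illustrate the calculation.
peptide = "LSRMWDKFEHYR"
window = best_window(peptide, 6)
print(window, degeneracy(window))
```

Stretches rich in methionine and tryptophan, which have a single codon each, give the least degenerate primers, whereas leucine, serine, and arginine, with six codons each, are best avoided.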

Ethylene 

Fungal pathogens not only directly inhibit plant growth, they also cause 
the plant to synthesize stress ethylene, which causes some of the damage 
sustained by plants infected with fungal phytopathogens. For example, it 
is well known that exogenous ethylene often increases the severity of a 
fungal infection, whereas ethylene synthesis inhibitors significantly 
decrease the severity of a fungal infection.

TABLE 15.5 Reduction in the severity of damage (damping off) of cucumber caused
by P. ultimum

Treatment                 Seed germination   Shoot fresh   Root fresh
                          rate (%)           weight (g)    weight (g)

None                      100                3.48          2.85
P. ultimum                31                 0.69          0.21
CHAO + P. ultimum         79                 2.11          0.95
CHAO/ACC + P. ultimum     87                 2.27          1.27

Adapted from Wang et al., Can. J. Microbiol. 46:898-907, 2000.

CHAO, the biocontrol bacterium P. fluorescens CHAO; CHAO/ACC, the biocontrol bacterium P.
fluorescens CHAO transformed with the ACC deaminase gene from E. cloacae UW4; UW4, E. cloacae UW4.
The expression of a foreign ACC deaminase gene in P. fluorescens CHAO results in an increase in the number
of cucumber seeds that germinate in the presence of the fungal pathogen, as well as an increase in the fresh
weight of both shoots and roots of the resulting plants.


As mentioned earlier, stress ethylene that is synthesized in response to 
fungal pathogen infection is produced in two peaks, a small one that occurs 
within a few hours after fungal infection and a much larger peak that 
occurs several days after fungal infection (Fig. 15.4). The first peak turns on 
the transcription of genes that encode proteins that protect plants against 
the pathogen, while the second peak is deleterious to the plant. Ideally, it 
would appear to be advantageous to allow the plant to synthesize the first, 
but not the second, peak of ethylene in response to a fungal pathogen. This 
is readily achieved using ACC deaminase-containing plant growth-promoting
bacteria, since high-level expression of ACC deaminase does not
occur until several hours after the appearance of increased amounts of
ACC. Thus, these bacteria do not alter the first ethylene peak but significantly
decrease the magnitude of the second peak.

A biocontrol bacterial strain, P. fluorescens CHAO, was transformed with
the P. putida UW4 ACC deaminase gene, and the effect of this manipulation
on the damage to cucumbers caused by P. ultimum was assessed. The ACC
deaminase-containing biocontrol bacterial strain was more effective in lessening
the damage than the wild-type biocontrol strain that did not possess
the enzyme. Not only did plants inoculated with the ACC deaminase-transformed
strains have greater root and shoot biomass than those treated
with the wild-type biocontrol strain, but also the number of seeds that germinated
in pathogen-containing soil was larger (Table 15.5). In addition,
the ACC deaminase-transformed biocontrol strain reduced the extent of
soft rot of potato slices, caused by the bacterial pathogen Erwinia carotovora
subsp. carotovora, in sealed plastic bags by approximately 50% compared with
the wild-type biocontrol strain (Table 15.6). In effect, ACC deaminase acts
synergistically with other mechanisms of biocontrol, such as the production of
antibiotics or antifungal enzymes, to prevent phytopathogens from damaging
plants.
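
The size of these effects can be read directly from the tabulated values. The short sketch below recomputes the relative changes from the numbers in Tables 15.5 and 15.6; the helper function and variable names are illustrative, and no values beyond those in the two tables are used.

```python
def percent_change(new: float, reference: float) -> float:
    """Relative change of `new` with respect to `reference`, in percent."""
    return 100.0 * (new - reference) / reference


# Table 15.6: weight of rotted potato tissue (g/slice).
rotted = {"None": 15.6, "CHAO": 14.5, "CHAO/ACC": 7.5}
rot_reduction_vs_wild_type = -percent_change(rotted["CHAO/ACC"], rotted["CHAO"])
rot_reduction_vs_untreated = -percent_change(rotted["CHAO/ACC"], rotted["None"])

# Table 15.5: cucumber root fresh weight (g) in the presence of P. ultimum.
root_weight = {"CHAO": 0.95, "CHAO/ACC": 1.27}
root_gain = percent_change(root_weight["CHAO/ACC"], root_weight["CHAO"])

print(f"Soft rot reduction vs. wild-type strain: {rot_reduction_vs_wild_type:.1f}%")
print(f"Soft rot reduction vs. no treatment:     {rot_reduction_vs_untreated:.1f}%")
print(f"Root fresh weight gain vs. wild type:    {root_gain:.1f}%")
```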

Root Colonization 

Depending upon the mechanism that a particular biocontrol bacterium
uses to thwart the damage to plants caused by pathogenic microorganisms,
it can be advantageous for the biocontrol strain to bind as tightly as possible
to the plant root. One way to improve root colonization by biocontrol
bacteria entails overexpressing the bacterial sss gene. This gene is normally
thought to play a role in DNA rearrangements that regulate the transcription
of a gene(s) involved in the biosynthesis of cell surface components.
When the sss gene was introduced on a plasmid into two strains of P. fluorescens,
one that was normally a poor root colonizer and the other a good
root colonizer, colonization of tomato roots was increased by approximately
28- and 12-fold, respectively. This work is an important first step in
engineering more effective strains of biocontrol bacteria.


Nitrogen Fixation 

Nitrogen gas (N₂), which makes up approximately 80% (by volume) of the
air that we breathe, cannot be used directly by plants or animals to synthesize
essential nitrogen-containing biomolecules, such as amino acids and
nucleotides. Rather, it must first be converted (fixed) into ammonia. This
conversion requires a high input of energy because the triple bond of N₂
(N≡N) is extremely stable. The energy for the biological fixation of nitrogen
comes from the hydrolysis of large amounts of ATP. Similarly, the chemical
(industrial) conversion of N₂ to ammonia uses a considerable amount of
energy in the form of high temperature and pressure.

More than 100 million tons of fixed nitrogen is needed annually to 
sustain global food production. Synthetic (chemically produced) fertilizers 
account for about half of this nitrogen supply, and most of the remainder is 
derived from diazotrophic bacteria. No eukaryote is known to fix nitrogen. 
Chemical fertilizers have helped considerably in increasing crop yields, but 
their continual use has led to pollution problems as a result of runoff and 
to depletion of the nutrient reserves in the soil. Moreover, their cost has 
been rising steadily. These factors have provided an incentive for developing
alternative sources of fixed nitrogen, including the development of
diazotrophic microorganisms as "bacterial fertilizers." 

A wide range of bacteria can fix nitrogen, and a number of them have 
potential as crop fertilizers. However, until a bacterial fertilizer has been 
shown conclusively to be as effective as a chemical formulation, there will 
be reluctance to change current practices, especially in those countries 
where the cost of chemical fertilizer is not significant relative to the value 
of the crop. For example, soybeans, which constitute the second largest 
crop in the United States in terms of both cash value and total acres planted, 
form a beneficial symbiotic relationship with the bacterium Bradyrhizobium 
japonicum. In this symbiosis, the bacteria provide the plant with fixed
nitrogen and, in turn, receive photosynthetically fixed carbon from the
plant. When plants are inoculated with specific strains of B. japonicum, the
final yield of plant material can be increased by 25 to 50%, and the inoculated
plants no longer require the addition of chemically fixed nitrogen.
Although approximately 40% of the world's soybean crop is produced in
just a few locales in the United States and agricultural practices tend to be
similar throughout these locales, at present, only a small fraction of this
crop is treated with B. japonicum. Most of these farmers continue to depend
on the naturally occurring strains of B. japonicum in the soil and chemical
fertilizers.


TABLE 15.6 Effect of P. fluorescens CHAO and CHAO/ACC on soft rot of potatoes
by E. carotovora subsp. carotovora

Treatment     Weight of rotted potatoes (g/slice)

None          15.6
CHAO          14.5
CHAO/ACC      7.5

Adapted from Wang et al., Can. J. Microbiol. 46:898-907, 2000.

For definitions of abbreviations, see Table 14.9. While the biocontrol
strain P. fluorescens CHAO does not significantly alter the extent of damage
to potatoes due to the bacterial pathogen, expression of ACC deaminase
lowers the level of stress ethylene and decreases the damage to the potatoes
by approximately 50%. The total weight of each potato slice was approximately
20 g.

The most important microorganisms that are currently used agriculturally
to improve the nitrogen content of plants include a range of rhizobial
genera and species (Table 15.7). These bacteria are gram negative, flagellated,
and rod shaped, and they form symbiotic relationships with legumes.
Generally, each rhizobial species is specific for a limited number of plants 
and will not interact with plants other than its natural hosts (Table 15.7). 

TABLE 15.7 Plant specificities of various rhizobial species

Bacterial species                       Host plant(s)

Azorhizobium caulinodans                West African legume (Sesbania rostrata)
Bradyrhizobium elkanii                  Soybean (Glycine max), black-eyed pea
                                        (Vigna unguiculata subsp. dekindtiana),
                                        mung bean (Vigna radiata)
Bradyrhizobium japonicum                Soybean (G. max)
Mesorhizobium amorphae                  Desert false indigo (Amorpha fruticosa)
Mesorhizobium ciceri                    Chickpea (Cicer arietinum)
Mesorhizobium chacoense                 White carob tree (Prosopis alba)
Mesorhizobium huakuii                   Chinese milk vetch (Astragalus sinicus)
Mesorhizobium loti                      Lotus (Lotus japonicus)
Mesorhizobium mediterraneum             Chickpea (C. arietinum)
Mesorhizobium tianshanense              7 Legume species
Rhizobium sp. strain NGR234             >100 Tropical legume species
Rhizobium etli                          Kidney bean (Phaseolus vulgaris), mung
                                        bean (V. radiata)
R. etli bv. mimosae                     Mimosa (Mimosa affinis)
Rhizobium galegae                       Goat's rue (Galega officinalis, Galega
                                        orientalis)
Rhizobium gallicum                      Common bean (P. vulgaris)
Rhizobium huautlense                    Danglepod (Sesbania herbacea)
Rhizobium leguminosarum bv. phaseoli    Kidney bean, mung bean
R. leguminosarum bv. trifolii           Clover (Trifolium spp.)
R. leguminosarum bv. viciae             Pea (Pisum sativum)
Rhizobium sullae                        Sweetvetch (Hedysarum coronarium)
Rhizobium tropici                       Mimosoid trees (Leucaena spp.) and
                                        some tropical legume trees
                                        (Macroptilium spp.)
Sinorhizobium fredii                    Soybean (G. max)
Sinorhizobium meliloti                  Alfalfa (Medicago sativa)
Sinorhizobium morelense                 White popinac (Leucaena leucocephala)


As part of their life cycle, rhizobial bacteria invade plant root cells and
initiate a complex series of developmental changes that lead to the formation
of a root nodule. Inside the root nodule, the bacteria proliferate and
persist in a form that has no cell wall. The bacteria within the nodules fix 
atmospheric nitrogen by means of the enzyme nitrogenase. The structural 
and biochemical interactions between a symbiotic rhizobacterium and a 
host plant are quite intricate and mutually beneficial. Inside a nodule, 
nitrogenase is protected from the toxic effects of atmospheric oxygen in two 
ways. First, oxygen does not readily diffuse into a nodule. Second, the 
oxygen content within a nodule is regulated by the protein leghemoglobin. 
The heme moiety of this oxygen-binding protein is synthesized by the bacterium,
and the globin portion of the molecule is encoded by a plant gene.
The plant also provides the bacterium with photosynthetically fixed 
carbon, which the bacterium requires for growth. For its part, the plant 
benefits from this symbiotic relationship by receiving fixed nitrogen from 
the bacterium. 

Nitrogenase 

The renewed interest in diazotrophs as biological fertilizers overlapped the 
development of techniques for gene isolation and manipulation and provided
the impetus for studying the biochemical and molecular biological
aspects of nitrogen fixation. Initially, scientists believed that these studies
would lead to the development of improved nitrogen-fixing organisms that
would enhance crop yields. Some researchers even went so far as to suggest
that bacterial genes for nitrogen fixation might be introduced directly
into plants to enable them to fix their own nitrogen. Although this overly 
optimistic prediction has not materialized, a detailed understanding of the 
process of nitrogen fixation has emerged. And with this understanding, the 
possibility of improving the nitrogen-fixing activity of some diazotrophs 
by genetic manipulation is a little closer to becoming a reality. 

Components of Nitrogenase 

All known nitrogenases have two oxygen-sensitive components. Component
I is a complex of two identical α-protein subunits (approximately 50,000
daltons each), two identical β-protein subunits (approximately 60,000 daltons
each), 24 molecules of iron, 2 molecules of molybdenum, and an iron-molybdenum
cofactor, often called FeMoCo (Fig. 15.17). Component II has
two α-protein subunits (approximately 32,000 daltons each), which are not
the same as the α-protein subunits of component I, and a number of associated
iron molecules. The catalytic conversion of nitrogen to ammonia requires the
combination of components I and II, a complex of magnesium and ATP,
and a source of reducing equivalents (reaction 1; the upward-pointing
arrow indicates a gas and Pᵢ is inorganic phosphate).

N₂ + 8H⁺ + 8e⁻ + 16MgATP → 2NH₃ + H₂↑ + 16MgADP + 16Pᵢ    (1)
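
The stoichiometry of reaction 1 can be checked with a short electron and ATP count. This is a worked tally that assumes the usual oxidation-state bookkeeping (nitrogen goes from 0 in N₂ to -3 in NH₃, and hydrogen from +1 in H⁺ to 0 in H₂); it adds nothing beyond what reaction 1 already states.

```latex
\begin{align*}
\text{electrons used to reduce } \mathrm{N_2} \text{ to } 2\,\mathrm{NH_3} &= 2 \times 3 = 6 \\
\text{electrons used to form } \mathrm{H_2} &= 2 \times 1 = 2 \\
\text{total electrons} &= 6 + 2 = 8 \\
\text{MgATP hydrolyzed per electron} &= 16 / 8 = 2 \\
\text{MgATP hydrolyzed per } \mathrm{NH_3} \text{ formed} &= 16 / 2 = 8 \\
\text{fraction of electrons diverted to } \mathrm{H_2} &= 2 / 8 = 25\%
\end{align*}
```

On this accounting, one quarter of the reducing equivalents consumed in reaction 1 end up in hydrogen gas rather than in ammonia.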

In addition to fixing nitrogen, the nitrogenase can reduce the gas acetylene 
to ethylene (reaction 2). 

H-C≡C-H + 2H⁺ → H₂C=CH₂    (2)

The measurement by gas chromatography of ethylene production as a function
of time provides a convenient assay for nitrogenase activity. This assay
can be performed with intact cells in solution (Fig. 15.18), bacteria associated 
with plant roots, crude cell extracts, or highly purified enzyme preparations. 
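
Because the assay follows ethylene production as a function of time, the activity is usually summarized as the slope of a straight line fitted to the time course. The sketch below does this with an ordinary least-squares fit. The time points, ethylene amounts, and protein quantity are invented for illustration and are not data from the text.

```python
def least_squares_slope(times, amounts):
    """Slope of the best-fit line through (time, amount) points."""
    n = len(times)
    mean_t = sum(times) / n
    mean_a = sum(amounts) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, amounts))
    return sxy / sxx


# Hypothetical gas chromatography readings from a sealed assay vessel.
minutes = [0, 10, 20, 30, 40]
ethylene_nmol = [0.0, 5.1, 9.8, 15.2, 19.9]

rate = least_squares_slope(minutes, ethylene_nmol)  # nmol ethylene per min
protein_mg = 2.0                                    # assumed amount of protein
specific_activity = rate / protein_mg               # nmol per min per mg

print(f"Rate: {rate:.3f} nmol/min")
print(f"Specific activity: {specific_activity:.3f} nmol/min/mg protein")
```

Fitting all of the time points, rather than using a single end-point reading, makes it easier to notice when production stops being linear, for example if the enzyme is being inactivated during the assay.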
FIGURE 15.17 Structure of the iron-molybdenum cofactor bound to a molecule of
dinitrogen (N₂).


Component I catalyzes the actual reduction of N₂, and component II donates
electrons to component I. Both components are extremely sensitive to 
oxygen and can be rapidly and irreversibly inactivated when the oxygen 
concentration is too high. In addition to components I and II, the activity of 
a complete, functional nitrogenase depends on 15 to 20 additional accessory 
proteins. The roles of most of the accessory proteins have been delineated 
and include the transfer of electrons to component II and the biosynthesis of 
the iron-molybdenum cofactor that is a part of component I. 

Genetic Engineering of the Nitrogenase Gene Cluster 

Nitrogen fixation is a very complicated process requiring the concerted 
actions of a large number of different proteins. Therefore, it was not realistic
to expect either that an intact single DNA fragment containing all the
genetic information for nitrogen fixation could be readily cloned from a 
diazotrophic microorganism and transferred into a nondiazotrophic 
organism or that a recipient organism could maintain the physiological 
conditions needed for nitrogenase activity. Consequently, the most direct 
way to isolate the genes involved in nitrogen fixation (nif genes) was to 
identify and characterize those clones of a wild-type library that restore 
nitrogen fixation to various mutants of the original organism. This process 
is called genetic complementation. 

The first nif genes identified by complementation were isolated from 
clone banks of the diazotroph Klebsiella pneumoniae. This well-studied 
organism is found in soil and water, as well as in the human intestine. The 
isolation protocol comprises the following steps (Fig. 15.19). 

1. K. pneumoniae cells are treated with a dose of a mutagenic agent
that allows approximately 0.1 to 1.0% of the cells to survive. Some
of the mutagenized cells are able to grow on a minimal medium
containing a source of fixed nitrogen, such as NH₄Cl, but do not
grow in the absence of fixed nitrogen. These cells are likely to have
a mutation in a nif gene and are designated Nif⁻.

2. A clone bank that consists of chromosomal DNA from wild-type
(Nif⁺) K. pneumoniae cells is constructed in a broad-host-range
plasmid expression vector and maintained in Escherichia coli.

3. The Nif⁻ K. pneumoniae cells are conjugated with the E. coli cells that
carry the clone bank on a plasmid shuttle vector.

4. The transformed K. pneumoniae cells are selected for the acquisition
of the Nif⁺ phenotype by plating them onto a minimal medium
that does not contain a source of fixed nitrogen. The only cells that
grow under these conditions are Nif⁻ K. pneumoniae cells containing
a plasmid encoding and expressing the protein that is either
missing or nonfunctional in the Nif⁻ mutant.

FIGURE 15.18 Assay for nitrogenase activity based on the conversion of acetylene to
ethylene. (A) The source of the nitrogenase enzyme, for example, bacteria growing
in culture, bacteria associated with plant roots, or a purified enzyme preparation
(not shown), is placed in a sealed container under an atmosphere of acetylene. (B)
Samples are periodically withdrawn from the sealed container, and the levels of
acetylene and ethylene are measured by gas chromatography. The extent of nitrogenase
activity is proportional to the amount of ethylene produced.

FIGURE 15.19 Procedure for isolating nif genes by genetic complementation. A clone
bank that was constructed with Nif⁺ K. pneumoniae DNA is used to complement a
Nif⁻ K. pneumoniae strain. Transformants are selected for growth on minimal
medium that does not contain fixed nitrogen.


The DNA fragment in the plasmid that complements the Nif⁻ chromosomal
mutation contains a nif gene that can be characterized more thoroughly
and used to isolate other nif genes.
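
The logic of selection by complementation can be sketched as a simple lookup: a mutant grows without fixed nitrogen only if the plasmid it receives carries a functional copy of the gene it lacks. The model below is deliberately simplified and assumes that each mutant has a single loss-of-function mutation; the plasmid names, the assignment of genes to inserts, and the mutant genotypes are all hypothetical.

```python
def complementing_clones(defective_gene, clone_bank):
    """Plasmids whose insert carries a wild-type copy of the defective gene."""
    return sorted(
        plasmid for plasmid, genes in clone_bank.items() if defective_gene in genes
    )


# Hypothetical clone bank: plasmid name -> genes present on its insert.
clone_bank = {
    "pBank1": {"nifH", "nifD"},
    "pBank2": {"nifK", "nifY"},
    "pBank3": {"trpA"},  # an insert that carries no nif gene
}

# Hypothetical mutant strains and the single gene that each one lacks.
mutants = {"mutant_1": "nifD", "mutant_2": "nifK"}

# Only transconjugants that receive a complementing plasmid would form
# colonies on minimal medium without fixed nitrogen.
selected = {
    strain: complementing_clones(gene, clone_bank) for strain, gene in mutants.items()
}
print(selected)
```

Using several independently derived mutants, as in this toy example, tends to recover different plasmids and therefore different genes.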

FIGURE 15.20 Partial restriction endonuclease digestion of the region of the
chromosomal DNA encoding nif genes. The nif gene that was isolated by genetic
complementation is used as a DNA hybridization probe (red). The probe and the DNA
fragments from the partial digest are ordered as they would be in the chromosomal
DNA. Colony hybridization of a clone bank consisting of the illustrated fragments,
using the probe shown, would be expected to yield clones containing plasmids with
fragments C and D. Using fragments C and D separately as probes to screen the
same clone bank would be expected to yield clones containing plasmids with
fragments B and E, respectively. Eventually, in this way, a large contiguous section of
the host chromosome is isolated as a set of overlapping clones.


Two approaches have been used to isolate other genes that are involved
in the nitrogen fixation process. First, the K. pneumoniae clone bank has been
used to complement a series of independently derived Nif⁻ mutants,
increasing the likelihood that in each case a different nif gene will be isolated.
Second, isolated nif genes have been used as DNA hybridization
probes to screen a K. pneumoniae chromosomal
DNA clone bank that carries large (7- to 10-kb) inserts (Fig. 15.20). The
latter scheme is based on the observation that in
prokaryotic organisms many of the genes involved in one pathway are clustered
on the chromosomal DNA and are often arranged in operons. Thus,
DNA hybridization enables investigators to identify clones containing additional
nif genes that are adjacent to the sequence initially isolated.
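
The second approach amounts to walking along the chromosome: each round of hybridization uses the clones found in the previous round as new probes. The sketch below models the fragments as intervals on the chromosome and treats a hybridization signal as an overlap between intervals. The fragment names follow Fig. 15.20, but the coordinates are invented for illustration.

```python
def overlaps(a, b):
    """True if two (start, end) intervals share any sequence."""
    return a[0] < b[1] and b[0] < a[1]


def walk(fragments, probe):
    """Return the clones recovered in each successive round of screening."""
    rounds, known, probes = [], set(), [probe]
    while probes:
        hits = sorted(
            name
            for name, span in fragments.items()
            if name not in known and any(overlaps(span, p) for p in probes)
        )
        if not hits:
            break
        rounds.append(hits)
        known.update(hits)
        probes = [fragments[name] for name in hits]  # new clones become probes
    return rounds


# Hypothetical coordinates (kb) for overlapping partial-digest fragments.
fragments = {"A": (0, 8), "B": (6, 14), "C": (12, 20), "D": (18, 26), "E": (24, 32)}
initial_probe = (18, 20)  # assumed position of the first nif gene isolated

print(walk(fragments, initial_probe))
```

With these assumed coordinates, the first round recovers fragments C and D, and the second recovers B and E, which mirrors the order of discovery described in the legend to Fig. 15.20.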

As a result of a considerable amount of research, the entire set of nif 
genes from K. pneumoniae has been isolated and characterized. These genes 
are arranged in a single cluster that occupies approximately 24 kb of the 
bacterial genome (Fig. 15.21). The cluster contains seven separate operons 
that together encode 20 distinct proteins (Table 15.8). All of the nif genes
must be transcribed and translated in a concerted fashion, under the regulatory
control of the nifA and nifL genes, to produce a functional nitrogenase.
The NifA protein is a positive regulatory factor. It turns on the
transcription of all of the nif operons except its own by binding to a specific
DNA sequence (5'-TGT-N₁₀-ACA-3') that is part of the promoter of each
nif operon. This binding site lies approximately 80 to 150 nucleotides
upstream of each transcriptional start site.

FIGURE 15.21 Arrangement of the genes in the K. pneumoniae nif gene cluster and
some of the functions that they encode. The nif genes are represented by italic
uppercase letters; each red arrow below sets of these letters denotes a specific nif
operon and the direction of its transcription. The arrows pointing up and away
from the gene designations show how some of the various gene products participate
in nitrogen fixation. F, flavodoxin molecule; FO, pyruvate-flavodoxin
oxidoreductase; FeMoCo, the iron-molybdenum cofactor; CoA, coenzyme A.


The DNA-bound NifA protein then interacts with a specific transcription
initiation protein called sigma-54 (σ⁵⁴) before transcription from the nif
promoter is initiated. The NifL protein is a negative regulatory factor. In the
presence of either oxygen or high levels of fixed nitrogen, it acts as an
antagonist of the NifA protein and, as a result, turns off the transcription of
all other nif genes.
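
The NifA recognition sequence, TGT followed by any 10 nucleotides and then ACA, is simple enough to locate with a regular expression. The sketch below scans one strand of a DNA sequence for candidate sites; the test sequence is invented for illustration and is not taken from a real nif promoter, and the function name is hypothetical.

```python
import re

# TGT, then any 10 nucleotides, then ACA. The lookahead lets overlapping
# candidate sites be reported as well.
NIFA_SITE = re.compile(r"(?=(TGT[ACGT]{10}ACA))")


def find_nifa_sites(sequence: str):
    """Return (position, site) pairs for every candidate NifA-binding site."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in NIFA_SITE.finditer(sequence)]


# Hypothetical promoter fragment containing one candidate site.
upstream_region = "GCGC" + "TGT" + "AAATTTCCCG" + "ACA" + "GGCC"
sites = find_nifa_sites(upstream_region)
print(sites)
```

A match to so short a motif is only a candidate: whether it is functional also depends on its position relative to the transcriptional start site, which this sketch does not check.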

The bacterium K. pneumoniae does not make a major contribution to the 
overall global biological fixation of nitrogen. Therefore, to genetically engineer
nitrogen fixation in soil bacteria that are more important in promoting
plant growth, other nif genes have been cloned and characterized. To do
this, the nif genes from K. pneumoniae have been used as DNA hybridization
probes to isolate nif genes from clone banks of other diazotrophic microorganisms.
Most diazotrophic organisms have a similar array of genes
encoding their nitrogen-fixing apparatus, and the DNA sequences of these
genes do not vary much from one organism to another.

It may be possible to increase the amount of nitrogen fixed by diazotrophic
organisms by manipulating the nifA and nifL genes. After
researchers genetically engineered extra copies of the nifA gene into a
Sinorhizobium meliloti strain, alfalfa plants that were inoculated with this
transformant grew larger and produced more biomass than plants that
were treated with the nontransformed strain. Similarly, it may be possible
to engineer the nifL gene so that the NifL protein, the negative regulator,
is less sensitive to the presence of fixed nitrogen. With this kind of deregulation,
an organism would fix more nitrogen for its plant partner.
However, not all nitrogen-fixing organisms have a NifL protein, so this
sort of manipulation may be limited to only certain bacterial strains. In
some organisms, the essential regions of NifL are part of NifA. Moreover,
increasing the amount of nitrogen that an organism can fix also increases
the amount of energy, usually in the form of fixed carbon, that is needed
to power its metabolism. Consequently, an engineered microorganism
that can fix a higher than normal level of nitrogen may lose its effectiveness
as a plant growth-promoting agent because of a diminished growth
rate.

Because of the complexity of nitrogen fixation by microorganisms, the
simple addition of one or two nif genes will not confer on a nondiazotrophic
recipient cell the ability to fix nitrogen. Moreover, genetic modification
of plants with the entire 24-kb nif gene cluster would not be
effective because the normal level of oxygen in the host cell would inactivate
nitrogenase, and if this level were reduced, the host plant cell would
probably die. In addition, the engineering of nitrogen fixation in plant
cells requires resolving major, if not insurmountable, transcriptional,
translational, and regulatory problems. For example, it is difficult to conceive
how the regulation of nitrogen fixation could be achieved, since
there are no plant promoters that respond to the NifA protein. Consequently,
nif genes would not be turned on in such a transgenic plant. Each of the nif
genes would also have to be under the control of separate promoters
because plant cells cannot process multigene transcripts. The introduction
of a functional nitrogen fixation capability into plants is therefore
extremely unlikely.


TABLE 15.8 K. pneumoniae genes involved in nitrogen fixation
and the functions of the proteins that they encode

nif gene            Function
D                   Nitrogenase component I α subunit
K                   Nitrogenase component I β subunit
H                   Nitrogenase component II
F                   Flavodoxin
J                   Pyruvate:flavodoxin oxidoreductase
Q, B, N, E, V       FeMoCo synthesis
M                   Processing of dinitrogenase reductase
A                   Positive activator
L                   Negative regulator
S                   Maturation of component I
W, Z, T, Y, U, X    Other, less well-defined functions
FIGURE 15.22 Schematic representation of the partitioning of rhizobial intracellular
glucose between glycogen synthesis and respiration. A mutation in the gene for 
glycogen synthase prevents glycogen from being synthesized, so that all of the 
glucose enters the tricarboxylic acid (TCA) cycle. In this cycle, the acetyl group of 
acetyl coenzyme A (CoA) is enzymatically degraded to form carbon dioxide and 
hydrogen. The hydrogen (or the corresponding electrons) is fed into the electron 
transport chain, and a large portion of the energy released is conserved by the
phosphorylation of ADP to ATP. The ATP is then available to "power" a large number of
energy-requiring metabolic processes, including nitrogen fixation. 


Engineering Improved Nitrogen Fixation 

Engineering oxygen levels. The concentration of oxygen is a critical factor 
in determining the amount of nitrogen that is fixed by a rhizobial strain. On 
one hand, oxygen is inhibitory to nitrogenase and is a negative regulator of 
nif gene expression. On the other hand, oxygen is required for bacteroid 
respiration. This conundrum can be resolved by the introduction of leghemoglobin,
which binds free oxygen tightly (Fig. 15.23) so that both the
transcription of nif genes and the functioning of nitrogenase can proceed 
unimpaired. In fact, the addition of exogenous leghemoglobin to isolated 
bacteroids results in a dramatic increase in nitrogenase activity. Thus, it is 
possible to engineer more efficient strains of Rhizobium by overproducing 
leghemoglobin. Alternatively, since the globin portion of leghemoglobin is 
produced by the plant, it may be more efficient to transform rhizobial 
strains with genes encoding a bacterial equivalent of leghemoglobin. 

When a strain of Rhizobium etli was transformed with a broad-host-range
plasmid carrying the hemoglobin gene from Vitreoscilla sp. (a gram-negative
aerobic or microaerophilic bacterium) and grown at low levels of dissolved
oxygen (0.25 to 1.0%), the rhizobial cells had a two- to threefold-higher
respiratory rate than the nontransformed strain. These
data suggest that free-living R. etli with a Vitreoscilla sp. hemoglobin gene
may have a competitive advantage over nontransformed rhizobial strains
in soil (which usually has a low level of oxygen). As has been observed in
the laboratory for numerous other free-living bacteria, the
hemoglobin-containing strain can grow to a greater extent because it is able
to sequester oxygen and provide it to the reactions where it is necessary for
bacterial metabolism (nitrogenase activity is at its peak at this time).

In greenhouse experiments, when bean plants were inoculated with 
either nontransformed or hemoglobin-containing R. etli, the plants inoculated
with the hemoglobin-containing strain had approximately 68% more
nitrogenase activity. This difference in nitrogenase activity leads to a 25 to 
30% increase in leaf nitrogen content at 40 to 50 days after infection and a 
16% increase in the nitrogen content of the seeds that are produced. Thus, 
the expression of a bacterial hemoglobin gene may be advantageous to 
Rhizobium bacteria both when they are free-living and when they are in 
bacteroids as part of a symbiotic relationship with their host plant. 

Modulating nifH and poly-β-hydroxybutyrate. In Mexico, most of the bean
plants (the second most important crop in Mexico) are nodulated by R. etli. 


FIGURE 15.23 (A) R. etli cells engineered to express a Vitreoscilla sp. hemoglobin gene 
bind low levels of dissolved oxygen, either from solution or from the soil. (B) The 
Vitreoscilla sp. hemoglobin protein binding to dissolved oxygen. 
The most common strain of R. etli encodes three copies of the nitrogenase
reductase (nifH) gene, each under the transcriptional control of a separate
promoter. To increase the amount of nitrogenase, the strongest of the three
nifH promoters (i.e., PnifHc) was coupled to the nifHcDK operon, which
encodes the nitrogenase structural genes (where nifHc is one of the three
nifH genes in this bacterium). The nifHc promoter is typically induced
during nodule development. The PnifHc-nifHcDK construct was cloned
into a broad-host-range plasmid and introduced into the wild-type strain
of R. etli. The net result of this genetic manipulation was a significant
increase in nitrogenase activity, plant dry weight, seed yield, and the
nitrogen content of the seeds (Table 15.9). Moreover, this genetic
manipulation worked as well or better when the PnifHc-nifHcDK construct was
introduced into the large Sym plasmid from R. etli that contains all of the
genetic information for nodulation and nitrogen fixation.
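
The size of these improvements can be read directly from Table 15.9. The short Python sketch below simply re-expresses the tabulated values as percent changes relative to the wild type. The dictionary layout and the function name `percent_change` are illustrative choices made here, not part of the original study.

```python
# Values transcribed from Table 15.9 (adapted from Peralta et al., 2004).
# Tuple order: nitrogenase activity, plant dry weight, seed yield, seed N content.
TRAITS = (
    "nitrogenase_activity",
    "plant_dry_weight",
    "seed_yield",
    "seed_n_content",
)

TABLE_15_9 = {
    "Wild type": (64.5, 0.54, 1.43, 33.9),
    "Wild type + extra nifHDK": (72.7, 0.66, 1.56, 41.4),
    "Wild type + nifHc": (77.3, 0.75, 1.73, 31.2),
    "Wild type + PnifHc-nifHcDK": (108.2, 0.81, 2.50, 43.6),
}


def percent_change(strain, baseline="Wild type"):
    """Return the percent change of each trait relative to the baseline strain."""
    base = TABLE_15_9[baseline]
    values = TABLE_15_9[strain]
    return {
        trait: 100.0 * (value - ref) / ref
        for trait, value, ref in zip(TRAITS, values, base)
    }


if __name__ == "__main__":
    for strain in TABLE_15_9:
        changes = percent_change(strain)
        summary = ", ".join(f"{t}: {c:+.1f}%" for t, c in changes.items())
        print(f"{strain}: {summary}")
```

For the PnifHc-nifHcDK strain, this works out to roughly a 68% rise in nitrogenase activity, a 50% rise in plant dry weight, and a 75% rise in seed yield over the wild type.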

Biological nitrogen fixation requires a large amount of energy in the
form of ATP. Thus, any mutation or genetic manipulation that increases the
flux of carbon sources consumed by a bacterium through the citric acid
cycle should be beneficial for nitrogen fixation (Fig. 15.22 and 15.24). This
is because metabolism of glucose via the citric acid cycle results in the
production of ATP. Consistent with this principle, it was observed that
expression of the PnifHc-nifHcDK construct in a poly-β-hydroxybutyrate-negative
strain of R. etli, which cannot divert acetyl coenzyme A into the storage
polymer (Fig. 15.24), enhanced plant growth to an even greater extent than when
this construct was expressed in a wild-type poly-β-hydroxybutyrate-positive
strain. Finally, since no foreign genes were introduced into R. etli,
the scientists who constructed these strains hope that the regulatory bodies
in their country will view the manipulated strains as benign and approve
them for widespread environmental use.


Hydrogenase 

An undesirable side reaction of nitrogen fixation is the reduction of H+ to
H2 (hydrogen gas) by nitrogenase. Energy in the form of ATP is wasted on
the production of hydrogen, which is eventually lost to the atmosphere.
Because of this side reaction, only 40 to 60% of the electron flux through
the nitrogenase system is transferred to N2, thereby significantly lowering
the overall efficiency of the nitrogen-fixing process. Theoretically, if H2
could be recycled to H+, the extent of energy loss could be diminished and
the nitrogen-fixing process would become more efficient. It is probably
impossible to prevent this side reaction directly, because it is a
consequence of the chemistry of the active site of the nitrogenase; hence,


TABLE 15.9 Symbiotic performance of genetically modified strains of R. etli in concert with common
beans (Phaseolus vulgaris)

Bacterial strain              Nitrogenase activity     Plant dry wt   Seed yield   Seed N content
                              (μmol of ethylene/h/g    (g/plant)      (g/plant)    (mg of N/g
                              of nodule)                                           of seed)
Wild type                     64.5                     0.54           1.43         33.9
Wild type + extra nifHDK      72.7                     0.66           1.56         41.4
Wild type + nifHc             77.3                     0.75           1.73         31.2
Wild type + PnifHc-nifHcDK    108.2                    0.81           2.50         43.6

Adapted from Peralta et al., Appl. Environ. Microbiol. 70:3272-3281, 2004.
FIGURE 15.24 Conversion of glucose to acetyl coenzyme A (CoA), which can either
enter the tricarboxylic acid (TCA) cycle, with the concomitant production of ATP, or
be converted into the carbon storage polymer poly-β-hydroxybutyrate.


blocking the side reaction by altering nitrogenase would concomitantly
inhibit nitrogenase activity.
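
The energetic cost of this side reaction can be estimated with simple arithmetic. The sketch below assumes that reducing one N2 to two NH3 requires 6 electrons and that nitrogenase hydrolyzes about 2 ATP per electron transferred. The second figure is a commonly cited textbook value and is not taken from this section, so the results should be read as rough estimates only. The function names are hypothetical.

```python
ELECTRONS_PER_N2 = 6   # N2 + 6 H+ + 6 e- -> 2 NH3
ATP_PER_ELECTRON = 2   # assumption: commonly cited value, not from this text


def atp_per_n2_fixed(fraction_to_n2):
    """Estimate ATP hydrolyzed per N2 reduced, given the fraction of
    nitrogenase electron flux that actually reaches N2."""
    if not 0 < fraction_to_n2 <= 1:
        raise ValueError("fraction_to_n2 must be in the interval (0, 1]")
    total_electrons = ELECTRONS_PER_N2 / fraction_to_n2
    return total_electrons * ATP_PER_ELECTRON


def atp_spent_on_h2(fraction_to_n2):
    """Estimate the ATP per N2 fixed that is consumed by H2 production."""
    useful_atp = ELECTRONS_PER_N2 * ATP_PER_ELECTRON
    return atp_per_n2_fixed(fraction_to_n2) - useful_atp


if __name__ == "__main__":
    for fraction in (0.4, 0.6):
        total = atp_per_n2_fixed(fraction)
        wasted = atp_spent_on_h2(fraction)
        print(f"{fraction:.0%} of electrons to N2: "
              f"{total:.0f} ATP per N2, {wasted:.0f} of them spent on H2")
```

Under these assumptions, when 40 to 60% of the electron flux reaches N2, about 30 to 20 ATP are consumed per N2 fixed, of which about 18 to 8 ATP go to hydrogen production.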

Hydrogen Metabolism 

In the mid-1970s, it was discovered that some strains of B. japonicum could
use hydrogen as an energy source for growth under microaerophilic
(low-oxygen-concentration) conditions. These strains have an enzyme called
hydrogenase that is able to take up H2 from the atmosphere and convert it
into H+ (Fig. 15.25). Experiments were undertaken to test whether the
presence of hydrogenase in B. japonicum had an impact on the growth of
soybean plants. Plants inoculated with strains that produce hydrogenase
(Hup+) had more biomass and nitrogen than plants that were treated with
non-hydrogenase-producing (Hup−) strains, despite higher levels of
nitrogenase activity in the Hup− strains (Table 15.10). From this and similar
experiments, it was concluded that the presence of a hydrogen uptake
system in a symbiotic diazotroph, such as B. japonicum, improves its ability
to stimulate plant growth, presumably by binding and then recycling the
hydrogen gas that is formed inside the nodule by the action of nitrogenase
(Fig. 15.25). Within the nodule, the contribution of atmospheric hydrogen is
negligible.
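
The pattern in Table 15.10 can be checked mechanically. In the sketch below, the values are transcribed from the table and the helper names are illustrative. It tests whether every Hup− mutant has higher relative nitrogenase activity than the parent yet lower plant dry weight and nitrogen content.

```python
# Relative values transcribed from Table 15.10 (adapted from Albrecht et al., 1979).
# Tuple order: nitrogenase activity, hydrogenase activity,
#              plant dry weight, nitrogen content.
TABLE_15_10 = {
    "SR": (1.00, 1.00, 1.00, 1.00),   # parental Hup+ strain
    "SR1": (1.27, 0.01, 0.81, 0.93),  # Hup- mutant
    "SR2": (1.13, 0.01, 0.74, 0.91),  # Hup- mutant
    "SR3": (1.23, 0.01, 0.65, 0.85),  # Hup- mutant
}


def mutant_means(table, parent="SR"):
    """Average each column over all strains other than the parent."""
    mutants = [values for name, values in table.items() if name != parent]
    return tuple(sum(column) / len(mutants) for column in zip(*mutants))


def hup_benefit_holds(table, parent="SR"):
    """True if every mutant has more nitrogenase activity than the parent
    but less plant dry weight and less nitrogen content."""
    p_nitrogenase, _, p_dry_weight, p_nitrogen = table[parent]
    return all(
        nitrogenase > p_nitrogenase
        and dry_weight < p_dry_weight
        and nitrogen < p_nitrogen
        for name, (nitrogenase, _, dry_weight, nitrogen) in table.items()
        if name != parent
    )


if __name__ == "__main__":
    print("Pattern holds for all mutants:", hup_benefit_holds(TABLE_15_10))
    print("Mutant means:", [round(m, 2) for m in mutant_means(TABLE_15_10)])
```

Averaged over the three mutants, relative nitrogenase activity is about 1.21, while relative plant dry weight and nitrogen content fall to about 0.73 and 0.90, respectively.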

Although it is clearly beneficial to the plant to obtain its nitrogen from 
a symbiotic diazotroph that has a hydrogen uptake system, this trait is not 
common in naturally occurring rhizobial strains. In one study, it was found 
that the majority of naturally occurring Rhizobium and Bradyrhizobium 
strains examined were Hup− (Table 15.11). In that study, the data were
based on a small number of strains for each species except B. japonicum, for 
which over 1,400 strains were assayed. The conclusion that can be drawn 
FIGURE 15.25 Recycling of the hydrogen gas that is produced as a by-product of
nitrogen fixation. Hydrogen is generated by nitrogenase at the expense of ATP, but 
by using this pathway, the hydrogen can be recaptured by hydrogenase. 


from this work is that commercial Hup− rhizobial strains are prime
candidates for transformation to a Hup+ phenotype.

Genetic Engineering of Hydrogenase Genes 

Although a considerable amount of effort has been directed over the past 
30 years or so toward studying hydrogenases from both diazotrophic and 
nondiazotrophic microorganisms, an in-depth understanding of the
structures and functions of these enzymes remains elusive. Many organisms
have more than one hydrogenase, and many hydrogenases consist of more 
than a single polypeptide chain. Some hydrogenases are active only in the 
uptake of hydrogen from the atmosphere, whereas others, depending on 
the conditions, can also synthesize hydrogen. One result of this complexity 
is that the conversion of a Hup− strain of Rhizobium into a Hup+ strain may
not be readily achieved by the introduction of just any hydrogenase gene.
Rather, the introduced gene(s) must encode all of the enzyme's subunits
and must be able to interact with the appropriate electron transport
molecule within the host organism.

The most common strategy for isolating hydrogenase genes has been 
genetic complementation. The first hydrogenase gene to be isolated was 


TABLE 15.10 Relative enzyme activities and growth-stimulating performance
of a parental Hup+ B. japonicum strain (SR) and three Hup− mutants (SR1, SR2,
and SR3)

B. japonicum   Relative       Relative       Relative     Relative
strain         nitrogenase    hydrogenase    plant dry    nitrogen
               activity       activity       weight       content
SR             1.00           1.00           1.00         1.00
SR1            1.27           0.01           0.81         0.93
SR2            1.13           0.01           0.74         0.91
SR3            1.23           0.01           0.65         0.85

Adapted from Albrecht et al., Science 203:1255-1257, 1979.

Nitrogenase activity was assessed by monitoring the amount of acetylene that was reduced to ethylene
as a function of time. Hydrogenase activity was measured by means of a hydrogen electrode. Plant dry
weight included the weights of both the leaf material and root material. The nitrogen content was
calculated as the fraction of the dry weight of the plant that was nitrogen. All values have been normalized
relative to the parental strain.
TABLE 15.11 Percentages of native Rhizobium, Sinorhizobium, and Bradyrhizobium
strains that have functional hydrogen uptake systems (Hup+)

Bacterium                                     Hup+ strains (%)
Rhizobium leguminosarum bv. leguminosarum     9.3
Rhizobium leguminosarum bv. trifolii          0
Rhizobium leguminosarum bv. phaseoli          0
Sinorhizobium meliloti                        21
Bradyrhizobium japonicum                      21
Bradyrhizobium sp.                            91

Adapted from Evans et al., Annu. Rev. Microbiol. 41:335-361, 1987.


the gene for an E. coli membrane-bound hydrogenase, and it was selected 
by complementation of an E. coli mutant that did not express this activity. 

After the work with E. coli, hydrogenase (hup) genes from B. japonicum
were isolated from a clone bank of wild-type DNA constructed in the
broad-host-range cosmid vector pLAFR1 by complementation of B.
japonicum Hup− mutants. The presence of a hydrogenase that takes up
hydrogen from the atmosphere in the complemented Hup− mutant strains
was indicated by the ability of the active hydrogenase to reduce the dye 
methylene blue in a hydrogen atmosphere. More detailed studies of the B. 
japonicum hup genes showed that they were organized into at least two, and 
possibly three, transcriptional units covering approximately 20 kb of the 
genome and including 18 separate genes. Subsequent work on the hup 
genes from Rhizobium leguminosarum has indicated that these genes are 
similar in both DNA sequence and gene organization to the hup genes from 
B. japonicum. Thus, the isolated hup genes from B. japonicum may be used as 
DNA hybridization probes to select homologous genes from a clone bank 
of R. leguminosarum. 

Following the isolation of R. leguminosarum hup genes, and despite the 
complexity of this system, it has been possible to use cosmid vectors to 
transfer a complete set of uptake hydrogenase genes from a Hup+ strain of
R. leguminosarum to a Hup− strain (Table 15.12). Plants treated with
R. leguminosarum that had been transformed to Hup+ grew larger and contained
more nitrogen than the plants inoculated with the Hup− parental strain
(Table 15.12). Although hydrogenase genes have not received as much 
attention as nif genes, this simple gene transfer experiment is a convincing 
demonstration of the use of genetic manipulation to improve the ability of 
a diazotroph to stimulate plant growth. 
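
One way to read Table 15.12 is to check that its columns are mutually consistent. The sketch below assumes that "nitrogen amount" is total nitrogen per plant and that "nitrogen concentration" is nitrogen per unit of dry weight, so that the amount should be close to dry weight multiplied by concentration. This interpretation is an assumption made here; it is not stated with the table.

```python
# Relative values for the Hup+ transformant, transcribed from Table 15.12
# (the Hup- parental strain is 1.00 in every column by definition).
HUP_PLUS = {
    "dry_weight": 1.35,
    "nitrogen_amount": 1.52,
    "leaf_area": 1.53,
    "nitrogen_concentration": 1.15,
}


def predicted_nitrogen_amount(row):
    """Predict relative total nitrogen as dry weight x nitrogen concentration
    (an assumed relationship, see the accompanying text)."""
    return row["dry_weight"] * row["nitrogen_concentration"]


def relative_discrepancy(row):
    """Fractional difference between predicted and reported nitrogen amount."""
    predicted = predicted_nitrogen_amount(row)
    return (predicted - row["nitrogen_amount"]) / row["nitrogen_amount"]


if __name__ == "__main__":
    print(f"Predicted: {predicted_nitrogen_amount(HUP_PLUS):.2f}")
    print(f"Reported:  {HUP_PLUS['nitrogen_amount']:.2f}")
    print(f"Discrepancy: {relative_discrepancy(HUP_PLUS):.1%}")
```

The predicted value (about 1.55) lies within roughly 2% of the reported 1.52, which is plausibly explained by rounding in the normalized data. The gain in total nitrogen therefore appears to come mostly from larger plants and only partly from a higher nitrogen concentration.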

More recently, one group of scientists modified the hup gene promoter 
in R. leguminosarum and in the process engineered a more efficient rhizobial 


TABLE 15.12 Plant growth and nitrogen assimilation after the introduction of hup
genes into a Hup− strain of R. leguminosarum

Bacterial    Relative     Relative    Relative     Relative
phenotype    plant dry    nitrogen    leaf area    nitrogen
             weight       amount                   concentration
Hup−         1.00         1.00        1.00         1.00
Hup+         1.35         1.52        1.53         1.15

Adapted from Brewin and Johnston, U.S. patent 4,567,146, January 1986.
The data have been normalized relative to the Hup− parental strain.






strain. In R. leguminosarum, 18 genes are associated with hydrogenase
activity. There are 11 hup genes (Fig. 15.26) responsible for the structural 
components of the hydrogenase, the processing of the enzyme, and electron
transport. There are also seven hyp (hydrogenase pleiotropic) genes,
which are involved in processing the nickel that is part of the active center 
of the enzyme. The hup promoter is dependent on the NifA protein (which 
is also required to activate the synthesis of nif genes), so that hup genes are 
expressed only within bacteroids. On the other hand, the hyp genes are 
transcriptionally regulated by an FnrN-dependent promoter, which is 
turned on by low levels of oxygen. Thus, the hyp genes are expressed both 
in bacteroids and microaerobically. By modifying the chromosomal DNA of 
R. leguminosarum and exchanging the hup promoter for an FnrN-dependent 
promoter (Fig. 15.26), a derivative of the original bacterium with an 
increased level of hydrogenase was created (Table 15.13). The engineered 
R. leguminosarum strain displayed a twofold increase in hydrogenase 
activity compared to the wild type, and no discernible amount of hydrogen 
gas was produced as a by-product of nitrogen fixation. This is expected to 
make this strain of R. leguminosarum much more effective at promoting 
plant growth and increasing plant nitrogen content. Moreover, regardless 
of whether nickel was added to the system, the amount of hydrogen 
evolved from nitrogen-fixing nodules was extremely low, indicating that 
virtually all of the hydrogen produced by nitrogenase was recycled. The 
reason that nickel was added to this system is that in many soils, the
availability of nickel limits hydrogenase activity. In some soils, the level of
nickel is so low that even if a naturally occurring strain contains
hydrogenase genes, the hydrogenase activity may be so low as to be ineffectual. On


FIGURE 15.26 Replacement of the NifA-dependent R. leguminosarum hup promoter 
with an FnrN-dependent promoter. The R. leguminosarum hyp gene cluster is 
already under the transcriptional control of an FnrN-dependent promoter. 
TABLE 15.13 Biological activities of wild-type and engineered (replacing the hup
promoter with an FnrN-dependent promoter) strains of R. leguminosarum

Strain              Bacteroid hydrogenase activity    Nodule H2 evolution (mmol/H2/g
                    (nmol of H2/h/mg of protein)      [fresh weight] of nodule)
                    −Ni          +Ni                  −Ni          +Ni
Wild type           1,080        2,930                2.13         <0.25
Engineered strain   2,210        5,060                0.34         <0.25

Adapted from Brito et al., Appl. Environ. Microbiol. 68:2461-2467, 2002, and Ureta et al., Appl. Environ.
Microbiol. 71:7603-7606, 2005.


the other hand, when the engineered R. leguminosarum was tested with 
various field soils, hydrogenase overproduction invariably overcame the 
limitation of low nickel levels, with the net result that the amount of fixed 
nitrogen, and hence plant productivity, was greater. 
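
The effect of the promoter exchange can be summarized as fold changes computed from Table 15.13. The sketch below is illustrative only: the dictionary layout and the name `fold_change` are choices made here, and entries reported as "<0.25" are stored as `None` because no exact ratio can be computed from a value given only as an upper bound.

```python
# Values transcribed from Table 15.13. None marks entries reported only as
# "<0.25", for which no exact ratio can be computed.
HYDROGENASE_ACTIVITY = {
    "wild_type": {"-Ni": 1080.0, "+Ni": 2930.0},
    "engineered": {"-Ni": 2210.0, "+Ni": 5060.0},
}
H2_EVOLUTION = {
    "wild_type": {"-Ni": 2.13, "+Ni": None},
    "engineered": {"-Ni": 0.34, "+Ni": None},
}


def fold_change(table, condition):
    """Return engineered / wild-type for one nickel condition,
    or None if either value is only known as an upper bound."""
    wild_type = table["wild_type"][condition]
    engineered = table["engineered"][condition]
    if wild_type is None or engineered is None:
        return None
    return engineered / wild_type


if __name__ == "__main__":
    for condition in ("-Ni", "+Ni"):
        print(condition,
              "hydrogenase:", fold_change(HYDROGENASE_ACTIVITY, condition),
              "H2 evolution:", fold_change(H2_EVOLUTION, condition))
```

Without added nickel, hydrogenase activity in the engineered strain is about 2.05 times that of the wild type, and nodule hydrogen evolution drops to about 16% of the wild-type value. With added nickel, the hydrogenase ratio is about 1.73.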


Nodulation

Competition among Nodulating Organisms 

A major goal of agricultural biotechnology research is the development, by 
genetic manipulation, of Rhizobium strains that can increase plant
productivity more effectively than naturally occurring strains. Many commercial
inoculant strains that have been developed by mutation and selection to be 
superior nitrogen fixers are not very effective at establishing nodules on 
host plant roots when placed in competition with Rhizobium strains that are 
already present in the soil. Conversely, although many of the strains that 
are indigenous to the soil are highly successful in establishing nodules in 
competitive situations, they are not especially efficient at nitrogen fixation. 
Therefore, to make use of the commercial inoculant strains, either the 
nodulation capability of these strains must be enhanced or indigenous 
rhizobial strains must be inhibited. 

Studies were undertaken to determine the genetic basis of this
"competitiveness" with the aim of adding these particular genes to the strains
that are used as inoculants. The nature of the competitive advantage of soil 
rhizobial species is not known, but it was reasoned that the indigenous 
bacteria might be more efficient at nodulation and, as a consequence, might 
prevent an inoculated strain from becoming established and forming its 
own nodules. 

Genetic Engineering of Nodulation Genes 

When scientists first attempted to isolate nodulation (nod) genes, the 
absence of any specific information about the biochemical or genetic basis 
of nodulation meant that a strategy had to be devised for the identification 
of the genes. Therefore, once again, genetic complementation was used. 
Nodulation-defective (Nod−) mutants of S. meliloti were transformed with
a clone bank of wild-type chromosomal DNA from S. meliloti, and those 
colonies that had acquired the ability to nodulate alfalfa roots were isolated 
(Fig. 15.27). More specifically, the steps of the procedure were as follows. 

1. A clone bank of wild-type (Nod+) S. meliloti was constructed by
partial digestion of S. meliloti DNA with EcoRI and insertion into
the unique EcoRI site of the broad-host-range cosmid pLAFR1,
which can carry up to 40 kb of cloned DNA.

2. The clone bank was packaged into bacteriophage λ, introduced
into E. coli, and subsequently transferred to Nod− mutants of S.
meliloti by conjugation. The vector carries a tetracycline resistance
gene that can be used as a selectable marker in both E. coli and S.
meliloti.

3. After conjugation, suspensions of 200 to 300 transformed S. meliloti 
cells were tested for the ability to nodulate sterile alfalfa plants, 
with the expectation that only those transformants that carried and 
expressed a gene that complemented the nodulation defect in the 
S. meliloti host would produce nodules. 

4. The bacteria that formed nodules on the test plants were recovered 
from within the nodules. These bacteria were then grown in culture
and used to retrieve the vector carrying the complementing
gene. The specific portion of the large insert DNA that carried the 
complementing gene was then subcloned onto another plasmid 
vector and analyzed further. 

5. Once a single nodulation gene was identified, it was used as a 
DNA hybridization probe to identify adjacent regions of S. meliloti 
chromosomal DNA in a genomic library (Fig. 15.20). 
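
Step 1 raises a practical question: how many cosmid clones must the bank contain before any given gene is likely to be represented? A common estimate is the Clarke-Carbon relation, N = ln(1 − P)/ln(1 − f), where f is the ratio of insert size to genome size and P is the desired probability of inclusion. The sketch below applies it. The genome size (about 6.7 Mb) and the average insert size (25 kb) are assumed, illustrative values rather than figures from this text, which states only that pLAFR1 can carry up to 40 kb. The helper `pools_to_screen` is hypothetical and gives only a rough lower bound on the number of plant tests if clones are screened in pools as in step 3.

```python
import math


def clones_needed(genome_bp, insert_bp, probability=0.99):
    """Clarke-Carbon estimate of the number of independent clones needed so
    that a given sequence is present with the requested probability."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if not 0 < insert_bp < genome_bp:
        raise ValueError("insert_bp must be positive and smaller than genome_bp")
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


def pools_to_screen(n_clones, pool_size=250):
    """Rough lower bound on the number of pooled plant tests, assuming each
    clone appears in exactly one pool. The default pool size is the midpoint
    of the 200 to 300 cells per suspension mentioned in step 3."""
    return math.ceil(n_clones / pool_size)


if __name__ == "__main__":
    ASSUMED_GENOME_BP = 6_700_000  # assumption: approximate S. meliloti genome
    ASSUMED_INSERT_BP = 25_000     # assumption: average cosmid insert size
    n = clones_needed(ASSUMED_GENOME_BP, ASSUMED_INSERT_BP, probability=0.99)
    print("Clones needed:", n)
    print("Pools to screen:", pools_to_screen(n))
```

Under these assumptions, on the order of 1,200 clones give 99% coverage, which corresponds to only a handful of pooled inoculations. Larger inserts reduce the number of clones required.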

The complete repertoire of nodulation genes from S. meliloti has been 
characterized. Detailed biochemical and genetic studies have revealed that 
nodulation and its regulation are complex processes that require the
functioning of a large number of genes (Table 15.14). Some of the nodulation
genes are highly conserved (common) among nodulating microorganisms,
and others are species specific. The nod genes are grouped into three
separate classes: common genes, host-specific genes, and the regulatory nodD
gene. Thus, for example, the nodABC genes are common to all Rhizobium
species and are structurally interchangeable. In most species, the nodABC
genes are found in a single operon.

A number of events are now known to occur during nodulation. First,
the nodD gene product, which is constitutively expressed, recognizes and
binds to a flavonoid molecule, which is excreted by the roots of the
potential host plants. Flavonoids are a class of plant phenolic molecules with a
basic structure that consists of 15 carbons arranged as two aromatic rings
connected by a 3-carbon bridge. They perform a number of different
functions for the plant, such as pigmentation and defense against fungi or
insects. The binding of flavonoids to the NodD gene product is one of the
major determinants of rhizobial host specificity, because each rhizobial
species recognizes only a limited number of flavonoid structures and each
plant species produces its own specific set of flavonoid molecules (Table
15.15). In a limited number of instances, other small organic molecules,
such as aldonic acids and betaines, that are exuded by plant roots or
germinating seeds and are present in large amounts can interact with the NodD
protein. Some strains, such as R. leguminosarum biovar (bv.) trifolii, have a
very narrow host range, responding to only a few kinds of flavonoids,
while others, such as Rhizobium sp. strain NGR234, have a very broad host
range and respond to a much larger number of different flavonoids.
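
The inducer lists in Table 15.15 can be treated as sets to see which compounds are shared between legumes. The sketch below is only a bookkeeping aid: the compound names are transcribed from the table (lowercased for comparison), the function names are illustrative, and a shared compound does not by itself imply a shared rhizobial partner, since host range depends on which flavonoids a given NodD protein actually recognizes.

```python
# nodD inducers transcribed from Table 15.15, lowercased for comparison.
INDUCERS = {
    "lupin": {"erythronic acid", "tetronic acid"},
    "alfalfa": {
        "stachydrine", "trigonelline", "luteolin", "chrysoeriol",
        "4,4'-dihydroxy-2'-methoxychalcone", "liquiritigenin",
        "7,4'-dihydroxyflavone",
    },
    "clover": {
        "7,4'-dihydroxyflavone", "geraldone", "4'-hydroxy-7-methoxyflavone",
    },
    "common bean": {
        "delphinidin", "kaempferol", "malvidin", "myricetin", "petunidin",
        "quercetin", "eriodictyol", "genistein", "naringenin",
    },
    "pea": {"apigenin", "eriodictyol"},
    "soybean": {"daidzein", "genistein", "coumestrol"},
    "vetch": {
        "3,5,7,3'-tetrahydroxy-4'-methoxyflavanone",
        "7,3'-dihydroxy-4'-methoxyflavanone", "naringenin",
        "4,4'-dihydroxy-2'-methoxychalcone", "liquiritigenin",
        "7,4'-dihydroxy-3'-methoxyflavanone",
        "5,7,4'-trihydroxy-3'-methoxyflavanone",
        "5,7,3'-trihydroxy-4'-methoxyflavanone",
    },
}


def shared_inducers(legume_a, legume_b):
    """Return the set of inducers listed for both legumes."""
    return INDUCERS[legume_a] & INDUCERS[legume_b]


def legumes_producing(compound):
    """Return a sorted list of legumes whose list includes the compound."""
    return sorted(name for name, found in INDUCERS.items() if compound in found)


if __name__ == "__main__":
    print(shared_inducers("alfalfa", "clover"))
    print(legumes_producing("naringenin"))
```

Most compounds appear for a single legume, which is in line with the point that each plant species produces its own characteristic set of inducers. The few overlaps (for example, 7,4'-dihydroxyflavone in both alfalfa and clover) are the exceptions.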

The binding of a flavonoid molecule activates the NodD gene product, 
presumably causing it to undergo a conformational change, and enables 
the flavonoid-NodD complex to attach to a nodulation promoter element 
FIGURE 15.27 Procedure for the isolation of S. meliloti
nodulation genes. The DNA from wild-type S. meliloti is
cloned into the broad-host-range cosmid pLAFR1, packaged
into bacteriophage λ, and introduced into E. coli.
The clone bank is then transferred from E. coli to a Nod−
S. meliloti strain by conjugation. Alfalfa plants are inoculated
with transformed Nod− S. meliloti. Those plants
that develop root nodules have been infected with Nod+
S. meliloti cells that presumably carry a complementing
nodulation gene inserted into the cosmid vector. The
transformed Nod+ S. meliloti cells can be isolated directly
from the root nodules.
TABLE 15.14 Some Rhizobium nodulation gene products and their probable
functions

Protein   Probable function
NodA      Common; cytoplasmic membrane; with NodB, stimulates cell division
NodB      Common; cytoplasmic membrane; with NodA, stimulates cell division
NodC      Common; outer membrane; chitin synthase
NodD      Common; positive transcriptional regulator; constitutive
NodE      Cytoplasmic membrane; β-ketoacyl synthase
NodF      Cytoplasmic; acyl carrier protein
NodG      Host-specific gene; dehydrogenase
NodH      Host-specific gene; sulfotransferase
NodIJ     Common; cytoplasmic membrane; capsular polysaccharide secretion
NodK      Affects onset of nodulation in some bradyrhizobia
NodL      Cytoplasmic membrane; acetyltransferase
NodM      D-Glucosamine synthetase
NodN      Unknown
NodO      Secreted, hemolysin
NodP      With NodQ; ATP sulfurylase
NodQ      With NodP; ATP sulfurylase
NodS      Methyltransferase
NodT      Outer membrane; secretion
NodU      Unknown
NodX      Cultivar specificity

Where biochemical or genetic evidence for the function of a particular protein is lacking, a possible function
is assigned based on homology of the amino acid sequence to a protein of known sequence. Different rhizobial
strains contain different subsets of these proteins. "Common" genes perform the same function in all species of
rhizobia.


called a nod box. This promoter element is located upstream from all the 
nodulation genes except the nodD gene, and it activates the transcription of 
these genes. 

TABLE 15.15 Some legumes and the nodD gene inducers that they produce

Legume                               Compound
Lupin (Lupinus albus)                Erythronic acid, tetronic acid
Alfalfa (Medicago sativa)            Stachydrine, trigonelline, luteolin, chrysoeriol,
                                     4,4'-dihydroxy-2'-methoxychalcone, liquiritigenin,
                                     7,4'-dihydroxyflavone
Clover (Trifolium repens)            7,4'-Dihydroxyflavone, geraldone,
                                     4'-hydroxy-7-methoxyflavone
Common bean (Phaseolus vulgaris)     Delphinidin, kaempferol, malvidin, myricetin,
                                     petunidin, quercetin, eriodictyol, genistein,
                                     naringenin
Pea (Pisum sativum)                  Apigenin, eriodictyol
Soybean (Glycine max)                Daidzein, genistein, coumestrol
Vetch (Vicia sativa subsp. nigra)    3,5,7,3'-Tetrahydroxy-4'-methoxyflavanone,
                                     7,3'-dihydroxy-4'-methoxyflavanone, naringenin,
                                     4,4'-dihydroxy-2'-methoxychalcone, liquiritigenin,
                                     7,4'-dihydroxy-3'-methoxyflavanone,
                                     5,7,4'-trihydroxy-3'-methoxyflavanone,
                                     5,7,3'-trihydroxy-4'-methoxyflavanone

The inducers are released by either roots or germinating seeds.

The nodABC genes encode proteins that cause the plant root hair tips to
swell and curl, an effect that is recognized as the initial step in the infection
of the plant root by the bacterium. The bacteria synthesize an oligosaccharide
Nod factor that is modified by the NodH gene product and perhaps
also by the NodQ and NodP gene products. This factor (Fig. 15.28) elicits 
in the plant a host-specific response that includes root hair curling and 
deformation and is essential for Rhizobium to induce nodules. After the 
initial change in the root morphology, the bacterium attaches to the root 
hair. Next, the bacterial cell penetrates the plant cell through an infection 
thread. Finally, depending on the bacterial strain, up to approximately 20 
additional nod gene products are synthesized. These proteins, together 
with some plant-encoded proteins, contribute to the formation of the 
nodule. 

DNA sequencing and computer analysis revealed that in a slow-growing
variant of a Bradyrhizobium sp., the region of the DNA between the
nodD and nodABC genes contained an open reading frame that the
fast-growing form lacked. This open reading frame was designated nodK. When
plants were inoculated with a Bradyrhizobium sp. that had a mutagenized
nodK gene (NodK−) and were compared with those treated with the
wild-type strain (NodK+), the onset of nodulation in the NodK− strain-treated
plants was 5 days earlier, the number of nodules was doubled, and there
was a 120% enhancement of plant yield.

To date, despite the fact that nod genes from several different rhizobial 
strains have been isolated and characterized, no simple genetic means has 
been devised for using nod genes to enable inoculated strains of Rhizobium 
to outcompete indigenous strains. Nevertheless, host specificity can be 
altered by the transfer of a nodD gene from a broad-specificity rhizobial 
strain to one with narrow specificity. 

It has become clear that the process of nodulation is quite complicated. 
Thus, considerable additional effort will be required before it is possible to 
further enhance the competitiveness of rhizobial strains by genetic
engineering.


FIGURE 15.28 Proposed structure of a typical Nod factor, NodRm-1. This compound 
elicits a host-specific plant response that includes root curling and deformation. 
Nodulation and Ethylene

Ethylene is often produced by plants following the initial stages of
infection (eventually leading to nodule formation) by rhizobia. This small rise in
the plant ethylene level is generally localized to a portion of the root and
can inhibit, and therefore limit, subsequent rhizobial infection and
nodulation. One way in which some strains of Rhizobium naturally increase the
number of nodules that they can form on the roots of a host legume is to 
limit the rise in ethylene that occurs following the initial infection. Different 
Rhizobium species decrease ethylene levels either by synthesizing a small 
molecule called rhizobitoxin that chemically inhibits ACC synthase, one of 
the ethylene biosynthetic enzymes, or by producing ACC deaminase and 
removing some of the ACC before it can be converted to ethylene. The 
result of lowering the local level of ethylene is that both the number of 
nodules and the biomass of the plant are increased by 25 to 40%. Assays of 
isolated rhizobia indicate that in the field approximately 1 to 10% of
rhizobial strains possess ACC deaminase. It should therefore be possible to
increase the nodulation efficiency of Rhizobium strains that lack ACC 
deaminase by genetically engineering these strains with isolated Rhizobium 
ACC deaminase genes (and their regulatory regions). In fact, insertion of a 
single copy of an ACC deaminase gene from R. leguminosarum bv. viciae 
into the chromosomal DNA of a strain of S. meliloti that lacked this enzyme 
dramatically increased both the nodule numbers and biomass of host 
alfalfa plants (Fig. 15.29). While genetically engineered strains of Rhizobium 
may not be acceptable for use in the field in all jurisdictions at this time, as 
a result of this work, several commercial inoculant producers are already 
screening their more recently isolated Rhizobium strains for active ACC 
deaminase. 


FIGURE 15.29 Increased ability of S. meliloti transformed with an ACC deaminase 
gene (and its regulatory region) to nodulate alfalfa (pink bars) and to promote plant 
growth (purple bars). 
Phytoremediation

Scientists are currently trying to develop new and improved methods to 
deal with both inorganic and organic environmental contaminants. 
However, most of these procedures are either very expensive or not especially
effective. One recently developed method of environmental cleanup
is called phytoremediation. This procedure uses plants to remove or 
sequester hazardous substances from the environment or to destroy them. 
Phytoremediation of metals and other inorganic compounds may take one 
of several forms: phytoextraction, the absorption and concentration of 
metals from the soil into the roots and shoots of the plant; rhizofiltration, 
the use of plant roots to remove metals from effluents; or phytostabilization,
the use of plants to reduce the spread of metals in the environment.
Phytoremediation of organic compounds may occur by phytostabilization, 
i.e., reducing the spread of the organic material in the environment;
phytostimulation, the stimulation of microbial biodegradation in the rhizosphere,
the area around the roots of plants; or phytotransformation, the absorption 
and degradation of contaminants by the plant. 

Following the testing of a large number of different plants, several 
plants that can naturally accumulate large amounts of metal have been 
identified and are being used to a limited extent for the phytoremediation 
of metals in the environment. These plants are called hyperaccumulators 
and are often found growing in soils with elevated metal concentrations. 
Unfortunately, plants that grow in the presence of very high concentrations 
of metals, even hyperaccumulating plants, are quite small. Depending 
upon the amount of metal at a particular site, it could take 15 to 20 years to 
completely remediate that site, even with hyperaccumulating plants. This 
is a time frame that is usually considered to be too long for practical
application.

A number of different types of plants, such as many common grasses, 
as well as corn (maize), wheat, soybean, peas, and beans, are effective at
stimulating the degradation of organic molecules in the rhizosphere. 
Typically, these plants all have extensive and fibrous root systems which 
form an extended rhizosphere. In addition to the biodegradation that takes 
place in the rhizosphere, several varieties of plants and trees can take up 
and degrade some organic contaminants. For example, plants with
phytotransformation activity may contain nitroreductases, which are useful for
degrading the explosive TNT (trinitrotoluene) and other nitroaromatics; 
dehalogenases for the degradation of chlorinated solvents and pesticides; 
and laccases that can degrade anilines, such as triaminotoluene. 

Engineering Strains That Facilitate Growth 

Although using plants for remediation of persistent organic contaminants 
has advantages over other methods, many limitations exist for the large- 
scale application of this technology. For example, many plant species are 
sensitive to contaminants, so they grow slowly, and it is necessary to establish
sufficient biomass for meaningful soil remediation. In addition, in most
contaminated soils, the number of microorganisms is depressed, so that 
there are not enough bacteria either to facilitate contaminant degradation 
or to support plant growth. To remedy this situation, both degradative and 
plant growth-promoting bacteria may be added to the plant rhizosphere. 
Phytoremediation (i.e., degradation of organic compounds in the presence 
of plants) alone is not much faster than bioremediation (i.e., where
biodegradation of the organics is due to the activities of microorganisms
independent of plants). On the other hand, cultivating plants together with plant
growth-promoting bacteria allows the plants to germinate to a much 
greater extent and then to accumulate a larger amount of biomass in the 
presence of an environmental contaminant (Fig. 15.30). Typically, plant 
growth-promoting bacteria alleviate a portion of the stress imposed upon 
a plant by the presence of organic contaminants, and healthier plants are 
more efficient at breaking down organic contaminants. 

In one study, plant growth-promoting bacteria that facilitate the
phytoremediation of polycyclic aromatic hydrocarbons were developed and
tested. Polycyclic aromatic hydrocarbons in the environment are of concern 
because of their toxic, mutagenic, and carcinogenic properties. The strain 
Pseudomonas asplenii AC, isolated from polycyclic aromatic hydrocarbon- 
contaminated soil, has plant growth-promoting activity, most likely due to 
its synthesis of indoleacetic acid. This strain was engineered to be more 
efficient at reducing stress in plants by transforming it with a bacterial gene 
for the enzyme ACC deaminase (and its regulatory region). The engineered 
strain was designated P. asplenii AC-1. The ability of the wild-type and 
transformed strains, as well as the transformed strain encapsulated in an 
alginate matrix (alginate is a biodegradable carbohydrate-based polymer), 
to promote the growth of canola plants grown in the greenhouse in soil 
containing polycyclic aromatic hydrocarbons was tested. In the presence of 
high levels of polycyclic aromatic hydrocarbons, the growth of canola 
plants was dramatically reduced. When strain AC was added to canola 
seeds, plant growth improved somewhat. Moreover, addition of strain 
AC-1, either in suspension or alginate encapsulated, dramatically improved 
plant growth (Fig. 15.31). These results suggest that plant growth in the 


FIGURE 15.30 Growth of Kentucky bluegrass with (+) or without (-) plant growth- 
promoting bacteria (PGPB) in the presence of increasing amounts of polycyclic 
aromatic hydrocarbons. In every instance, the plants attain a significantly greater 
amount of biomass when plant growth-promoting bacteria are added to the seeds 
before they are planted. 


FIGURE 15.31 Canola root (brown) and shoot (green) biomass after 25 days of growth 
in soil containing 6 g of polycyclic aromatic hydrocarbons per kg. Adapted from 
Reed and Glick, Can. J. Microbiol. 51:1061-1069, 2005.


presence of polycyclic aromatic hydrocarbons was facilitated by both bacterial
indoleacetic acid and ACC deaminase. In addition, several factors
may favor the alginate-encapsulated inoculant. As the alginate matrix dissolves,
the encapsulated bacteria are released steadily over time, potentially
allowing greater bacterial colonization of the plant roots, especially in
the presence of polycyclic aromatic hydrocarbons, which can limit bacterial 
growth and persistence. Alginate encapsulation has also been reported to 
increase plasmid stability, which is important because the ACC deaminase 
gene was introduced into strain AC-1 on a broad-host-range plasmid. 
While these results are preliminary, they indicate that this approach may be 
useful in the cleanup of contaminated field sites. 

Engineering Degradative Plasmids 

In addition to engineering bacterial strains to facilitate plant growth in the 
presence of stressful contaminants, it is also possible to develop plant 
growth-promoting bacteria that can degrade some contaminants. In one 
study, scientists engineered a plant growth-promoting strain of P. fluorescens
to be able to degrade 2,4-dinitrotoluene (Fig. 15.32). The compound
2,4-dinitrotoluene, which is an intermediate in the synthesis of both
polyurethane and various explosives, is a problem pollutant. Its presence in the
environment is widespread, and it is both toxic and carcinogenic. Several 
species of Burkholderia carry plasmids that encode enzymes that can break 
down 2,4-dinitrotoluene. However, some species of Burkholderia have been 
found to be either plant pathogens or opportunistic human pathogens, so 
that there has been reluctance to deliberately release any Burkholderia 
strains, even seemingly harmless ones, into the environment. Instead, 
using three minitransposons, all of the genes necessary for the complete 
degradation of 2,4-dinitrotoluene were introduced into the chromosomal 
DNA of a plant growth-promoting strain of P. fluorescens. In liquid culture,
FIGURE 15.32 Pathway for 2,4-dinitrotoluene (2,4-DNT) degradation. The enzymes
involved in this pathway include DntA, 2,4-dinitrotoluene dioxygenase; DntB,
4-methyl-5-nitrocatechol (4M5NC) monooxygenase; DntC, 2-hydroxy-5-methylquinone
(2H5MQ) reductase; DntD, 2,4,5-trihydroxytoluene (2,4,5-THT) oxygenase;
DntG, 2,4-dihydroxy-5-methyl-6-oxo-2,4-hexadienoic acid (DMOHA)
isomerase/4-hydroxy-2-keto-5-methyl-6-oxo-3-hexenoate hydrolase; and DntE, coenzyme
A-dependent methylmalonate semialdehyde (CoASH) dehydrogenase.


the engineered P. fluorescens strain completely degraded toxic levels of 
2,4-dinitrotoluene at both 28 and 10°C. Following inoculation of plant 
seeds, a much greater fraction of the seedlings survived in contaminated 
soil when the engineered rather than the wild-type P. fluorescens strain was 
used (Table 15.16). 
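
The size of the effect reported in Table 15.16 can be made concrete with a little arithmetic. The following Python sketch is an illustration only and is not part of the original study. It takes the survival percentages from the table and compares the engineered strain with the wild type and with uncontaminated soil. The dictionary keys and function names are hypothetical labels chosen for this example.

```python
# Seedling survival (%) 14 days after planting, copied from Table 15.16.
# Keys are (soil condition, added bacterial strain); the labels are illustrative.
survival = {
    ("no DNT", "none"): 65,
    ("DNT", "none"): 4,
    ("DNT", "wild type"): 11,
    ("DNT", "engineered"): 42,
}


def fold_change(treatment, reference):
    """Ratio of seedling survival in one treatment to survival in another."""
    return survival[treatment] / survival[reference]


# Engineered strain compared with the wild-type strain in contaminated soil.
engineered_vs_wild = fold_change(("DNT", "engineered"), ("DNT", "wild type"))

# Engineered strain in contaminated soil compared with uncontaminated soil.
engineered_recovery = fold_change(("DNT", "engineered"), ("no DNT", "none"))

print(f"Engineered vs. wild type: {engineered_vs_wild:.1f}-fold")
print(f"Fraction of uncontaminated survival restored: {engineered_recovery:.2f}")
```

On these figures, the engineered strain gives nearly four times the survival seen with the wild type and restores roughly two-thirds of the survival observed in the absence of DNT.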

Engineering Bacterial Endophytes 

Some bacteria that normally bind to and proliferate on the roots of plants 
(rhizosphere bacteria) contain biodegradative plasmids that encode the 
enzymes for the complete breakdown of various organic contaminants (see 
chapter 13). However, the rhizosphere (the area around plant roots) is not 
always the environment that is most conducive to the degradation of these 
compounds. This is because when bacteria are attached to the root surface 
they are affected by soil pH, temperature, water content, and chemical 
composition, as well as the presence of amoebae, fungi, and other bacteria. 
Since many plants readily take up a wide range of organic compounds, it 
might be advantageous if the contaminant-degrading bacteria were localized
within the plant roots rather than on the root surface (bacteria that can
proliferate within plant tissues are called endophytes). To achieve this, one
group of workers transferred, by conjugation, a plasmid containing
biodegradative genes encoding enzymes that degrade toluene from a bacterium
TABLE 15.16 Survival of plant seedlings 14 days after being planted in the
presence of DNT

Presence of DNT    Added bacterial strain    Seedling survival (%)
-                  None                      65
+                  None                      4
+                  Wild type                 11
+                  Engineered                42

The seeds were inoculated with the indicated bacterium before being planted. -, absent;
+, present.


that binds only to root surfaces to an endophytic bacterium that can colonize
the interior tissue of the plant root but does not normally degrade
toluene (Fig. 15.33). The endophyte that cannot degrade toluene, the
surface-colonizing bacterium that can degrade toluene, and the transconjugant
endophyte that can degrade toluene were tested for the ability to


FIGURE 15.33 Conjugal transfer of a plasmid carrying toluene degradation genes 
from a root surface-colonizing bacterium (yellow) to a soil endophytic bacterium 
(green). In the presence of 3-week-old lupine plants, the resultant transconjugant is 
able to completely degrade toluene that is added to the soil. 
FIGURE 15.34 Schematic representation of a plant growth-promoting bacterium
bound to a plant root. In the presence of large amounts of metal (such as nickel
[Ni²⁺]) in the environment, the plant has difficulty acquiring a sufficient amount of
iron (Fe³⁺) from the soil. However, the siderophore that is secreted by the bound
bacterium has a very high affinity for iron, forming an iron-siderophore complex 
that can be taken up by the plant. Once inside the plant, the bacterial siderophore 
is cleaved, and the iron that is released is utilized in plant metabolic reactions. 


degrade toluene in the presence of 3-week-old lupine plants. When either 
no bacterium or the endophyte that could not degrade toluene was present, 
the toluene remained intact and was toxic to the plants. The surface-colonizing
bacterium that could degrade toluene removed some of the toluene
from the soil and allowed the plant to grow to a limited extent. On the other 
hand, the transconjugant toluene-degrading endophytic strain completely 
degraded the toluene and protected its host against toluene toxicity. Given 
the fact that there is still widespread reluctance to deliberately release 
genetically engineered bacteria into the environment in many jurisdictions, 
transconjugants carrying naturally occurring biodegradative pathways 
may be advantageous in that they are not necessarily considered to be 
genetically modified bacteria. 

Metals in the Environment 

While plants grown in metal-contaminated soils can withstand some of the 
effects of high concentrations of metals within their tissues, two features of 
most plants result in a decrease in plant growth and viability. That is, in the 
presence of high levels of metals, most plants (1) synthesize stress ethylene 
and (2) become severely iron depleted. However, plant growth-promoting 
bacteria can relieve some of the effects of metals on plants. First, ACC 
deaminase-containing plant growth-promoting bacteria decrease the level 
of stress ethylene in a plant growing in soil that contains high levels of 
metal. Second, plants can take up and utilize complexes between bacterial 
siderophores and iron. In metal-contaminated soils, plants are generally 
unable to obtain enough iron because iron uptake is inhibited by the metal 
contaminant(s). Plant siderophores bind to iron with a much lower affinity 
FIGURE 15.35 Effect of adding the wild-type plant growth-promoting bacterium
Kluyvera ascorbata SUD165 (WT) or a siderophore-overproducing mutant of the 
bacterium, K. ascorbata SUD165/26 (mutant), on plant dry weight and chlorophyll 
concentration in the presence of nickel (Ni). The error bars indicate standard 
errors. 


than bacterial siderophores, so plants are often unable to accumulate a
sufficient amount of iron unless bacterial siderophores are present (Fig. 15.34).

Some metal-resistant bacterial strains promote plant growth in the presence
of inhibitory levels of nickel, lead, or zinc and are therefore an effective
adjunct to plants in phytoremediation studies. In one instance, a mutation 
that caused the overproduction of a bacterial siderophore was selected. 
When the wild-type bacterium and the siderophore-overproducing mutant 
were tested in the laboratory, the siderophore-overproducing mutant stimulated
plant growth significantly more than the wild-type bacterium (Fig.
15.35). When the siderophore-overproducing mutant was tested in the field
with the plant Indian mustard (Brassica juncea) in soil that had been
contaminated with nickel over a period of many years, both the number of
seeds that germinated in the nickel-contaminated soil and the size that the 
plants were able to attain were increased by 50 to 100%. Overall, there was 
a two- to fourfold increase in the amount of nickel removed from the soil by 
the addition of the mutant compared with the wild-type bacterium. On the 
other hand, the presence of the mutant bacterium had no measurable influence
on the amount of nickel accumulated per milligram (dry weight) in
either plant roots or shoots. Therefore, the bacterial plant growth-promoting 
effect in the presence of nickel is attributable to the increase in the amount 
of plant material and the number of plants. 
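
One way to see why the amount of nickel removed can rise two- to fourfold while the nickel content per milligram of tissue stays the same is to treat the total as a simple product. The sketch below assumes that nickel removed is proportional to (number of plants) x (biomass per plant) x (nickel per mg of dry weight). This is a simplification introduced here for illustration, not a model taken from the study, and the function and parameter names are hypothetical.

```python
def relative_nickel_removed(germination_factor, size_factor, ni_per_mg_factor=1.0):
    """Relative change in total nickel removed from the soil.

    Assumes total removal scales as plants x biomass per plant x nickel per mg.
    A factor of 1.5 means a 50% increase; 2.0 means a 100% increase.
    """
    return germination_factor * size_factor * ni_per_mg_factor


# Germination and plant size each increased by 50 to 100% with the mutant,
# while nickel accumulated per mg (dry weight) was unchanged (factor 1.0).
low = relative_nickel_removed(1.5, 1.5)
high = relative_nickel_removed(2.0, 2.0)

print(f"Estimated increase in nickel removed: {low:.2f}- to {high:.2f}-fold")
```

Under this assumption the estimate runs from about 2.25-fold to 4-fold, which is in line with the two- to fourfold increase reported for the field trial.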

Phytoremediation is still at an early stage of development and currently
accounts for only a very small fraction of the total amount spent each
year for the remediation of hazardous sites. However, the world remediation
market, estimated in 2001 to be $25 billion to $30 billion, is expected to
grow to nearly $100 billion by around 2010, and it is estimated that
phytoremediation could account for up to 10% of this market.

At present, the largest number of sites being remediated contain 
organic contaminants, because organics are easier and less expensive than 
metals to remediate. However, the removal of metal contaminants from the 
environment is expected to receive more attention in the future. 
SUMMARY


Many soil microorganisms have the ability to stimulate the
growth of plants. With an eye to diminishing dependency
on chemical fertilizers, the molecular mechanisms by
which bacteria promote plant growth have been examined. 
Plant growth promotion may be achieved directly by the 
ability of the bacteria to fix nitrogen, sequester iron, facilitate 
phosphorus uptake, produce phytohormones that trigger 
responses in a growing plant, or enzymatically reduce potentially
inhibitory levels of the plant hormone ethylene. Some
soil bacteria can also stimulate plant growth indirectly by 
inhibiting the growth of phytopathogenic microorganisms. 

Recently, considerable effort has been devoted to understanding
and productively utilizing the ability of plant growth-
promoting bacteria to facilitate plant growth by lowering 
plant ethylene levels. This occurs as a consequence of the 
action of the enzyme ACC deaminase, which breaks down 
ACC, the immediate precursor of ethylene in all higher 
plants. 

Plants often respond to a variety of different environmental 
stresses by synthesizing ethylene, which can trigger a stress/ 
senescence response in the plant. The increased level of ethylene
synthesized in response to trauma inflicted by chemicals,
temperature extremes, water stress, ultraviolet light,
insect predation, disease, and mechanical wounding can be 
both the cause of some symptoms of stress and the inducer of 
responses that enhance the survival of the plant under adverse 
conditions. This seemingly paradoxical situation is explained 
by the presence of two bursts of ethylene synthesis following 
the stress. The first, small peak activates the synthesis of plant 
defense proteins, while the much larger peak of ethylene, 
which is synthesized later, can exacerbate the impact of the 
stressor. 

The enzyme ACC deaminase, when present in plant 
growth-promoting bacteria, can act to modulate the level of 
ethylene in a plant and thereby decrease the deleterious effects 
of a variety of stressors. Bacteria that contain ACC deaminase 
activity can significantly decrease the inhibition of plant 
growth that is observed in the presence of high salt levels, 
phytopathogens, flooding, or drought. These bacteria can also 
be used as an adjunct in phytoremediation (environmental 
cleanup using plants) strategies that are designed to remove 
metals or organic contaminants from the environment. 

Of the plant growth-promoting bacteria that have been 
studied in detail and are currently used in agricultural practice,
much of the research has focused on rhizobial bacteria.
These organisms form a complex, obligatory symbiotic relationship
with specific plants.

The molecular basis of nitrogen fixation has been examined 
extensively. Nitrogenase, the nitrogen-fixing enzyme, has
been characterized in detail. Molecular genetic studies have
established that bacterial nitrogen fixation is a complex process
that requires seven operons, with a total of 20 different
proteins, that are coordinately regulated. This complexity has 
so far frustrated attempts to create plants that can fix nitrogen 
and has prevented transfer of the ability to fix nitrogen to 
other bacteria. 

The amount of nitrogen that can be fixed by a rhizobial 
strain may be increased by genetic engineering of genes that 
indirectly affect nitrogen fixation. For example, nitrogen fixation
may be increased by inhibiting the synthesis of
rhizobial glycogen, by modulating the level of oxygen within
the bacterial cell, or by preventing rhizobial synthesis of the
polymer poly-β-hydroxybutyrate, which normally acts as a
carbon storage compound. 

As part of the action of nitrogenase, hydrogen gas (H₂) is
generated at the expense of ATP. Some Rhizobium strains possess
the enzyme hydrogenase, which is able to recycle H₂ in
vivo to H⁺, an activity that increases the efficiency of nitrogen
fixation. When strains are defective in hydrogenase activity,
the ability to fix nitrogen and promote plant growth is diminished.
With this in mind, hydrogenase genes have been cloned
into strains of rhizobial bacteria that form symbiotic relationships
with crop plants. Genetic engineering of hydrogenase
genes can produce rhizobial strains with an enhanced ability 
to fix nitrogen. 

Part of the interaction between symbiotic rhizobial strains 
and plants is the formation of nodules on the roots of plants 
that are the sites of bacterial nitrogen fixation. It has been reasoned
that enhancing nodulation by genetic engineering will
enable inoculated rhizobial strains to be more effective competitors
for sites on the roots of target plants than indigenous
strains. However, studies to date have shown that the genetic
basis of nodulation is complex, involving a number of different
genes, so that at present there is no simple way to
manipulate this process genetically. 

The indirect promotion of plant growth occurs when plant 
growth-promoting bacteria decrease or prevent the damage 
that is caused by either fungal or bacterial phytopathogens. 
The bacteria that act in this way are referred to as biocontrol 
bacterial strains. Some of the substances produced by biocontrol
bacteria, such as siderophores, antibiotics, other small
molecules, and various enzymes that can lyse fungal cell
walls, help to limit the damage to plants by phytopathogens.
The activity of biocontrol bacteria may be augmented by engineering
these strains to be better root colonizers, more efficient
at lowering plant ethylene levels, and more active
producers of antibiotics or enzymes. 
REVIEW QUESTIONS

1. Starting with a strain of Bradyrhizobium japonicum that can 
fix nitrogen and form a symbiotic relationship with soybean 
roots, and assuming that you do not have a DNA hybridization
probe for nod genes, outline a scheme that you would use
for isolating the cluster of nodulation genes from this 
organism. 

2. How does glycogen synthesis affect the ability of a strain of 
Rhizobium to fix nitrogen? 

3. How would you engineer a rhizobial strain to have a lower 
internal level of free oxygen? How might this affect the bacterium
in the free-living state? As a bacteroid?

4. What is hydrogenase? How could it be used to improve the 
yield of alfalfa? 

5. Suggest a strategy that you might employ to isolate all of 
the genes involved in nitrogen fixation from Azotobacter 
vinelandii, assuming that you do not have nif genes from other 
microorganisms to use as hybridization probes. 

6. What might be the consequences of mutagenizing either 
nifA or nifL with respect to the amount of nitrogen that an 
organism can fix? 

7. Discuss whether it is possible to genetically engineer plants 
to fix nitrogen. 

8. What are siderophores? How could genetic manipulation 
of siderophore genes enable bacteria to enhance plant 
growth? 


9. Suggest a scheme for isolating siderophore biosynthesis 
genes. 

10. What are the advantages of microbial fertilizers over 
chemical fertilizers? 

11. How do ACC deaminase-containing plant growth-promoting
bacteria facilitate plant growth?

12. What is phytoremediation? How do plant growth-promoting
bacteria affect phytoremediation?

13. How can A. radiobacter be engineered to be a more effec¬ 
tive biocontrol agent? 

14. Which enzymes secreted by plant growth-promoting bacteria
contribute to their ability to act as biocontrol agents?
How do these enzymes contribute to biocontrol? 

15. How does poly-β-hydroxybutyrate affect the ability of a
strain of Rhizobium to fix nitrogen? 

16. How might endophytic bacteria be useful as part of a
phytoremediation strategy?

17. Assuming that you do not have a DNA hybridization 
probe available, how would you isolate a bacterial antifreeze 
protein gene? 

18. What strategies can be employed to increase the effectiveness
of biocontrol bacterial strains?

19. What mechanisms do free-living plant growth-promoting 
bacteria use to facilitate plant growth? 


Microbial Insecticides 
Of all classes of organisms, insects have the largest number of
described species (more than 750,000). Insects negatively affect 
humans in a variety of ways: they cause massive crop damage, and 
they act as vectors of both human and animal diseases. During the 1940s, a 
number of chemical insecticides were developed as a means of controlling 
the proliferation of noxious insect populations. One of these was the chlorinated
hydrocarbon DDT (dichlorodiphenyltrichloroethane), which had
originally been synthesized in the 1870s but was not recognized as an insecticide
until the late 1930s. DDT proved to be exceptionally effective in
killing and controlling many species of pests. Chlorinated hydrocarbons 
such as DDT function by attacking the nervous system and muscle tissue 
of insects. Later, other chlorinated hydrocarbons, such as dieldrin, aldrin, 
chlordane, lindane, and toxaphene, were synthesized and applied on a
massive scale against crop pests and insects that carry infectious agents. 

Organophosphates, another class of chemical insecticides that includes 
malathion, parathion, and diazinon, were initially developed as chemical 
warfare agents. Now they are used to control insect populations by inhibiting
the enzyme acetylcholinesterase, which hydrolyzes the nerve transmitter
acetylcholine, thereby disrupting the functioning of motor neurons
and neurons in the brain of the insect. 

By the early 1960s, over 100 million acres of U.S. agricultural land was 
being treated annually with chemical insecticides. At about that time, 
researchers realized that chlorinated hydrocarbon insecticides, to a large 
extent, and organophosphate insecticides, to a lesser extent, had dramatic 
and immediate side effects and long-term and indirect effects on animals, 
ecosystems, and humans. Chlorinated hydrocarbons, exemplified by DDT, 
were found to persist in the environment for more than 20 years and to 
accumulate in increasing concentrations through food chains. This bioaccumulation
in fatty tissues had a significant biological impact on many
organisms. For example, in North America, many species of birds, including 
peregrine falcons, sparrow hawks, bald eagles, brown pelicans, and 
double-crested cormorants, underwent severe population declines. 
During the 1950s, as the targeted insect pest populations became
increasingly resistant to treatment with many chemical insecticides, higher 
concentrations of the insecticides were applied to control the pests. In addition,
chemical insecticides were found to lack specificity; consequently,
beneficial insects were killed along with those that were considered to be 
pests. In fact, in some instances, the natural enemies of the insect pest species
were killed off more efficiently than the target organisms, with the
bizarre result that pesticide treatment led to greater numbers of the insects. 

Given all the drawbacks associated with the use of chemical insecticides,
alternative means of controlling harmful insects have been sought.
Using insecticides that are produced naturally by either microorganisms or 
plants was an obvious choice. On the positive side, these compounds are 
usually highly specific for a target insect species, biodegradable, and slow 
to select for resistance. But, on the negative side, their low potency and 
high cost of production limit their use for a variety of applications. 
Recombinant DNA technology provides an opportunity to overcome many 
of these negative attributes. In particular, the insecticidal activities of the 
bacterium Bacillus thuringiensis and insect baculovirus systems have been 
developed into safe, specific, and effective insecticides. 

The worldwide market for pesticides is enormous: currently more than 
$30 billion per year and growing rapidly. Although biopesticides, mostly B. 
thuringiensis, make up only about 1% of this total, much of the expected 
growth in this field is likely to involve biopesticides. 


Insecticidal Toxin of B. thuringiensis

Mode of Action and Use 

A microbial insecticide can be a microbially produced toxic substance that 
kills an insect species or an organism that has the ability to fatally infect a 
specific target insect. The most studied, most effective, and most often utilized
microbial insecticides are the toxins synthesized by B. thuringiensis.
This bacterium comprises a large number of strains and subspecies, each of 
which produces a different toxin that can kill specific insects—there are 
more than 150 different subspecies of B. thuringiensis (Table 16.1). For 


TABLE 16.1 Some properties of the insecticidal toxins from various strains of
B. thuringiensis

B. thuringiensis strain    Protoxin size    Target insects          Serotype
or subspecies              (kDa)
berliner                   130-140          Lepidoptera             1
kurstaki KTO, HD-1         130-140          Lepidoptera             3
entomocidus 6.01           130-140          Lepidoptera             6
aizawai 7.29               130-140          Lepidoptera             7
aizawai IC 1               135              Lepidoptera, Diptera    7
kurstaki HD-1              71               Lepidoptera, Diptera    3
tenebrionis (san diego)    66-73            Coleoptera              8
morrisoni PG14             125-145          Diptera                 8
israelensis                68               Diptera                 14

Adapted from Lereclus et al., p. 37-69, in Entwistle et al. (ed.), Bacillus thuringiensis, an Environmental
Biopesticide: Theory and Practice (John Wiley & Sons, Chichester, United Kingdom, 1993).
example, B. thuringiensis subsp. kurstaki is toxic to lepidopteran larvae,
including those of moths, butterflies, and skippers; cabbage worms; and 
spruce budworms. B. thuringiensis subsp. israelensis kills diptera, such as 
mosquitoes and blackflies. B. thuringiensis subsp. tenebrionis (also known as 
B. thuringiensis subsp. san diego) is effective against coleoptera (beetles), 
such as the potato beetle and the boll weevil. In addition, some subspecies 
of B. thuringiensis produce insecticidal toxins that are directed against 
hymenoptera (sawflies, wasps, bees, and ants), orthoptera (grasshoppers, 
crickets, and locusts), and mallophaga (lice). 
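
The specificity described above lends itself to a simple lookup. As a hedged illustration, the Python sketch below records the rows of Table 16.1 and lists the strains reported there as active against a given insect order. The variable and function names are hypothetical, and the data cover only the strains listed in the table, not the full range of B. thuringiensis subspecies.

```python
# Rows of Table 16.1:
# (strain or subspecies, protoxin size in kDa, target insect orders, serotype)
TABLE_16_1 = [
    ("berliner", "130-140", ("Lepidoptera",), 1),
    ("kurstaki KTO, HD-1", "130-140", ("Lepidoptera",), 3),
    ("entomocidus 6.01", "130-140", ("Lepidoptera",), 6),
    ("aizawai 7.29", "130-140", ("Lepidoptera",), 7),
    ("aizawai IC 1", "135", ("Lepidoptera", "Diptera"), 7),
    ("kurstaki HD-1", "71", ("Lepidoptera", "Diptera"), 3),
    ("tenebrionis (san diego)", "66-73", ("Coleoptera",), 8),
    ("morrisoni PG14", "125-145", ("Diptera",), 8),
    ("israelensis", "68", ("Diptera",), 14),
]


def strains_targeting(order):
    """Return the strains in Table 16.1 whose toxin targets the given insect order."""
    return [name for name, _size, targets, _serotype in TABLE_16_1 if order in targets]


for insect_order in ("Lepidoptera", "Diptera", "Coleoptera"):
    print(insect_order, "->", ", ".join(strains_targeting(insect_order)))
```

A lookup of this kind mirrors how a strain would be matched to a pest: a beetle problem points to subsp. tenebrionis, whereas mosquito or blackfly control points to subsp. israelensis or the other Diptera-active strains.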

The insecticidal activity (toxin) of B. thuringiensis subsp. kurstaki (first 
discovered in 1911) and other strains is contained within a very large structure
called a parasporal crystal, which is synthesized during bacterial
sporulation. Although no significant role on behalf of the bacterium has been
attributed to the parasporal crystal structure, by synthesizing the crystal, 
the bacterium is "providing for its future" in that a dead insect provides 
sufficient nutrients to allow germination of the dormant spore. The 
parasporal crystal accounts for approximately 20 to 30% of the dry weight of a
sporulated culture and usually consists mainly of protein (~95%) and a
small amount of carbohydrate (~5%). About 150 different parasporal
crystal proteins (Cry proteins) are known. The crystal is an aggregate of
protein that can generally be dissociated by mild alkali treatment into subunits.
The subunits can be further dissociated in vitro by treatment with
β-mercaptoethanol, which reduces disulfide linkages (Fig. 16.1).

The insecticidal toxins from the B. thuringiensis strains were previously
grouped into four major classes (CryI, CryII, CryIII, and CryIV) based on
the insecticidal activity of the toxin. These proteins were further organized
into subclasses (A, B, C, etc.) and subgroups (a, b, c, etc.). In the past few
years, as increasing numbers of B. thuringiensis strains were isolated and
their genes were characterized, it became clear that the original
classification was unable to accommodate many of the newly discovered
B. thuringiensis toxin genes. Therefore, a new system of B. thuringiensis gene
classification was introduced.

FIGURE 16.1 Schematic representation of a B. thuringiensis parasporal crystal
composed of Cry1 protoxin protein. Each 250-kDa protein subunit of the parasporal
crystal contains two 130-kDa polypeptides. (Molecular masses determined by
polyacrylamide gel electrophoresis are approximations and do not always provide exact
multiples.) Conversion of the 130-kDa protoxin into an active 68-kDa toxin requires
the combination of a slightly alkaline pH (7.5 to 8) and the action of a specific
protease(s), both of which are found in the insect gut. The activated toxin binds to
protein receptors on the surface of the gut epithelial cell membrane.
(Figure labels: 250-kDa subunit; protoxin; reduce with mercaptoethanol; treat with
protease.)

FIGURE 16.2 Schematic representation of a portion of the phylogenetic tree of B.
thuringiensis insecticidal (Cry) proteins. The different background colors delineate
the different levels of nomenclature ranks. Cry1 and Cry7 share less than 45%
identity, Cry1A and Cry1F are between 45% and 78% identical, and Cry1Ab and Cry1Ae
are between 78% and 95% identical. Adapted from Crickmore et al., Microbiol. Mol.
Biol. Rev. 62:807-813, 1998.
(Figure labels: primary rank; secondary rank; tertiary rank; horizontal axis, % amino
acid sequence identity.)

In the current classification scheme (established in 1998), B. thuringiensis
insecticidal (Cry) proteins are assigned designations based on their
degree of evolutionary divergence, as estimated by certain mathematical
algorithms. This scheme is readily visualized by constructing a phylogenetic
tree based on the amino acid sequences of B. thuringiensis toxin proteins,
i.e., Cry proteins (Fig. 16.2). Basically, the amino acid sequences of the
proteins are compared, and if the proteins are identical, then they have 100%
identity. If only 50% of the amino acids are the same, then the proteins
have 50% identity. The relationship among a set of protein sequences can
be deduced and represented as a branched tree. The nodes (branch points)
of the tree represent points of divergence. For the classification of the
B. thuringiensis Cry proteins, a four-part naming system was devised.
Demarcations, set at 95, 78, and 45% identity, show the boundaries that
define the different nomenclature ranks. The name that is given to a
particular toxin depends on the location of the node where the toxin protein
sequence enters the tree relative to these set boundaries. A toxin that joins
the tree to the left of the leftmost boundary is assigned a new primary rank
(an Arabic numeral), one that joins the tree between the central and left
boundaries is assigned a new secondary rank (an uppercase letter), one
that joins the tree between the central and the right boundaries is assigned
a new tertiary rank (a lowercase letter), and one that joins to the right of the
rightmost boundary is assigned a new quaternary rank (an Arabic
numeral). For example, Cry proteins that are less than 45% identical are
given a number (e.g., Cry1 and Cry7) and are assigned to the primary rank.
Cry proteins that are 45 to 78% identical to proteins of the primary rank are
further designated with an uppercase letter (e.g., Cry1A and Cry1F). The
complete Cry protein tree consists of the positions of all Cry proteins. This
classification system is utilized throughout this chapter, even when
referring to work that was published prior to the development of this system.
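
The rank boundaries described above can be illustrated with a short sketch. It is a simplification under stated assumptions: the actual scheme places each toxin by the position of its node in a phylogenetic tree, whereas this example uses only the percent identity between a new sequence and its closest already-named relative. The function names, the handling of values that fall exactly on a boundary, and the requirement that the two sequences be aligned beforehand are all choices made for illustration.

```python
# Boundaries (percent amino acid identity) quoted in the text.
PRIMARY_BOUNDARY = 45.0
SECONDARY_BOUNDARY = 78.0
TERTIARY_BOUNDARY = 95.0


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def new_rank(identity_to_closest: float) -> str:
    """Rank at which a new toxin receives a new designation.

    Assumption: a value exactly on a boundary is assigned to the higher-identity
    side; the text does not say how ties are treated.
    """
    if identity_to_closest < PRIMARY_BOUNDARY:
        return "primary"      # new Arabic numeral, e.g., Cry1 versus Cry7
    if identity_to_closest < SECONDARY_BOUNDARY:
        return "secondary"    # new uppercase letter, e.g., Cry1A versus Cry1F
    if identity_to_closest < TERTIARY_BOUNDARY:
        return "tertiary"     # new lowercase letter, e.g., Cry1Ab versus Cry1Ae
    return "quaternary"       # new final Arabic numeral


if __name__ == "__main__":
    for value in (30.0, 60.0, 90.0, 99.0):
        print(f"{value:5.1f}% identity -> new {new_rank(value)} rank")
```

A tree-based assignment can differ from this pairwise shortcut when a sequence is similar to several named toxins at once, which is one reason the published scheme relies on node positions.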

The parasporal crystal does not usually contain the active form of the
insecticide. Rather, once the crystal has been solubilized, the protein that is
released is generally a protoxin, a precursor of the active toxin. The
protoxin of many of the Cry toxins that are directed against lepidoptera has a
molecular mass of approximately 130 kilodaltons (kDa) (Fig. 16.1). When a
parasporal crystal is ingested by a target insect, the protoxin is activated
within the insect's gut by the combination of alkaline pH (7.5 to 8.0) and specific
digestive proteases, which convert the protoxin into an active toxin with a
molecular mass of approximately 68 kDa (Fig. 16.1). In its active form, the
toxic protein inserts itself into the membranes of the gut epithelial cells of
the insect and creates an ion channel, which leads to an excessive loss of
cellular ATP (Fig. 16.3). About 15 minutes after this ion channel forms,
cellular metabolism ceases; the insect stops feeding within a few hours,
becomes dehydrated, and eventually dies (in about 2 to 5 days). Because
the conversion of the protoxin to the active toxin requires both alkaline pH
and the presence of specific proteases, it is extremely unlikely that
nontarget species, such as humans and farm animals, will be affected.

The mode of action of B. thuringiensis toxins imposes certain constraints
on their application. To kill an insect pest, B. thuringiensis parasporal crystals
must be ingested. Contact of the bacterium or the insecticidal toxin with the
surface of the target organism has no effect on it. The requirement that the
insecticide be ingested, in part, limits the susceptibility of nontarget insects
and other animals to the insecticide. B. thuringiensis is generally applied by
spraying, so it is usually formulated with insect attractants to increase the
probability that the target insect will ingest the toxin. However, insects that
bore into plants or attack plant roots are less likely to ingest a B. thuringiensis
toxin that has been sprayed on a host plant, so other strategies have been
devised to control such pests. One approach is to create transgenic plants
that carry and express a B. thuringiensis toxin gene so that they are protected
from infestation throughout the growing season (see chapter 18).

FIGURE 16.3 Insertion of the B. thuringiensis toxin into the membrane of an insect gut
epithelial cell. The toxin forms an ion channel between the cell cytoplasm and the
external environment.

It was recently discovered for gypsy moths (and it has been suggested that
the same may hold for other insects) that the B. thuringiensis toxin does not
kill the larvae by itself, as previously thought. Rather, bacteria that are part
of the insect's gut microbial community are required for toxicity to the
insect. Elimination of the insect's gut bacteria by oral administration of
antibiotics abolished B. thuringiensis insecticidal activity, and
reintroduction of an Enterobacter sp. that is normally part of the insect's gut
microbial community restored this activity. The data indicate that the
B. thuringiensis toxin enables the enteric bacteria to reach the insect hemocoel by
permeabilizing the gut epithelium. In this way, the insect is killed much more
rapidly than might otherwise be expected. The discovery that B. thuringiensis
insecticidal activity depends on insect enteric bacteria should not have
any significant effect on the efficacy or use of B. thuringiensis-based
insecticides. However, this information may be important in the design and
execution of some laboratory experiments intended to better understand the
functioning of B. thuringiensis insecticidal strains and to facilitate the
development of improved biological insecticides.

A limiting feature of the action of the B. thuringiensis toxin is that it can
kill a susceptible insect only during a specific developmental stage.
Therefore, the toxin must be applied when the pest population is at a
particular stage in its life cycle (generally the larval stage). The other major
impediment to more widespread application of B. thuringiensis subsp.
kurstaki is that it costs from 1.5 to 3 times as much as chemical insecticides.
The limitations and the cost notwithstanding, several subspecies of
B. thuringiensis have been approved for use and have rapidly gained widespread
acceptance (Table 16.2).

TABLE 16.2 Some subspecies of B. thuringiensis that have been approved for use in
the field and some of their targets

B. thuringiensis
subspecies        Targets

aizawai           Fruits and nuts, berries, peppers, tomato, root crops, tobacco,
                  beans, corn, cotton, cabbage, eggplant, melons, cucumber,
                  cauliflower, broccoli, ornamentals

kurstaki          Berries, fruits, nuts, melons, cucumber, squash, eggplant,
                  tomato, broccoli, cabbage, kale, mustard, parsley, spinach,
                  turnip, lettuce, stored grain, stored crop seed, ornamentals,
                  cotton, celery, peanut, sugar beet, tobacco, avocado, onion,
                  carrot, forestry products, grape, canola, sorghum, wheat,
                  forage crops, corn, sunflower, root crops, cranberry

israelensis       Mosquito breeding habitat, including rice fields, ponds,
                  pastures, ditches, salt marshes, tidal water, sewage lagoons,
                  lakes; ornamental and nursery plants; mushrooms
                  (Agaricus bisporus)

tenebrionis       Eggplant, tomato, potato, ornamentals

B. thuringiensis subsp. kurstaki was first discovered in 1901, although its
commercial potential was largely ignored until 1951. Within recent decades,
however, B. thuringiensis subsp. kurstaki has become the major means of
controlling the spruce budworm in Canada. In 1979, approximately 1% of
the forest area in Canada that was treated with an insecticide to combat the
spruce budworm (about 2 million hectares, or 8,000 square miles) was
sprayed with B. thuringiensis subsp. kurstaki. The remainder of the treated
forest was sprayed with chemical insecticides. By 1986, the use of
B. thuringiensis subsp. kurstaki had increased dramatically. It was used to treat
approximately 74% of the forests sprayed in that year for spruce budworm.
In other countries, B. thuringiensis subsp. kurstaki has been used against tent
caterpillars, gypsy moths, cabbage worms, cabbage loopers, and tobacco
hornworms.

For the biological control (biocontrol) of insect pests, B. thuringiensis
subsp. kurstaki is typically applied by spraying approximately 1.3 × 10⁸ to
2.6 × 10⁸ spores per square foot (1 square foot is equivalent to 0.093 m²) of
the target area. Administration of the spores is timed to coincide with the
peak of the larval population of the target organism, because the parasporal
crystals, being sensitive to sunlight, are short-lived in the environment.
Under simulated conditions, sunlight degrades over 60% of the tryptophan
residues of the parasporal crystal within a 24-hour period, thereby
rendering the protein inactive. Depending on the amount of sunlight present,
parasporal crystals may persist in the environment for as little as a day or
as long as a month. The lack of persistence of the insecticidal protoxin in
the natural environment means that natural selection of resistant insects is
unlikely.
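
The application rate quoted above is given per square foot; converting it to metric units makes it easier to compare with rates expressed per hectare. The following is a minimal sketch of that arithmetic, reading the quoted range as 1.3 × 10^8 to 2.6 × 10^8 spores per square foot and using the conversion factor given in the text (1 square foot = 0.093 m^2). The function names are illustrative only.

```python
SQUARE_FOOT_IN_M2 = 0.093   # conversion factor quoted in the text
M2_PER_HECTARE = 10_000     # definition of the hectare


def spores_per_m2(spores_per_sqft: float) -> float:
    """Convert a spray density from spores per square foot to spores per m^2."""
    return spores_per_sqft / SQUARE_FOOT_IN_M2


def spores_per_hectare(spores_per_sqft: float) -> float:
    """Convert a spray density from spores per square foot to spores per hectare."""
    return spores_per_m2(spores_per_sqft) * M2_PER_HECTARE


if __name__ == "__main__":
    for rate in (1.3e8, 2.6e8):
        print(f"{rate:.1e} per sq ft = {spores_per_m2(rate):.2e} per m^2 "
              f"= {spores_per_hectare(rate):.2e} per hectare")
```

On these figures the range corresponds to roughly 1.4 × 10^9 to 2.8 × 10^9 spores per square meter.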

Toxin Gene Isolation 

To develop B. thuringiensis-based insecticides that have greater potencies
and broader host ranges, it is necessary to isolate and characterize the
protoxin gene(s). For the initial isolation of insecticidal protoxin genes, the first
step was to determine whether the toxin genes are located on a plasmid or
on the chromosomal DNA. To test for plasmid-borne toxin genes, the source
B. thuringiensis strain was conjugated with a strain that lacks insecticidal
activity. If the latter strain acquired the ability to synthesize the insecticidal
toxin, then the toxin gene(s) was most likely present on a plasmid, because
the transfer of chromosomal DNA during conjugation is a rare event.

The procedure for isolating a protoxin-encoding DNA sequence is a
familiar one. B. thuringiensis cells are grown in laboratory culture and
lysed. The total cellular DNA is isolated and separated into plasmid and
chromosomal DNA fractions by cesium chloride (CsCl) gradient
centrifugation. When the protoxin gene is located on the chromosome, a clone
bank is constructed from the chromosomal DNA. When the toxin gene(s) is plasmid
encoded, the plasmid DNA can be further fractionated by sucrose gradient
centrifugation, which separates different plasmids according to their sizes
and enriches for the DNA that serves as the starting material for the
isolation of a protoxin gene(s) (Fig. 16.4).

FIGURE 16.4 Procedure for the isolation and partial enrichment of plasmid DNA
fractions from a microorganism with a number of different plasmids, one of which
encodes an insecticidal protoxin. OD260, optical density at 260 nm.

B. thuringiensis subsp. kurstaki contains an insecticidal protoxin gene
on one of seven different plasmids that are approximately 2.0, 7.4, 7.8, 8.2,
14.4, 45, and 71 kilobase pairs (kb) in length. To determine which
B. thuringiensis subsp. kurstaki plasmid carries the protoxin gene, following
sucrose gradient centrifugation, the plasmid DNA sample is divided into
three fractions that contain, respectively, the small (2.0-kb), medium-sized
(7.4-, 7.8-, 8.2-, and 14.4-kb), and large (45- and 71-kb) plasmids. The
fraction with the small plasmid is discarded, because the plasmid is too small
to encode a protein equivalent to the 130-kDa protoxin. A protein of this
size requires roughly 3.5 kb of coding DNA (about 1,180 amino acids at an
average of 110 Da per residue). The medium and large plasmid
fractions are each partially digested with the restriction enzyme Sau3AI
and then ligated into the BamHI site of plasmid pBR322. In the original
experiments, these clone banks were transformed into Escherichia coli, and
then the colonies were screened immunologically (see chapter 3) by the
following procedure to detect clones that expressed a Cry protein and
therefore carried a cry gene.

1. Colonies were transferred from agar plates to a nitrocellulose
membrane.

2. The transferred colonies were lysed with organic solvents.

3. All available sites on the membrane to which primary and
secondary antibodies could potentially bind (nonspecifically) were
blocked by treating the membrane with bovine serum albumin
(which bound to the nonspecific sites and prevented antibodies
from binding to those sites).

4. The bovine serum albumin-treated membranes were treated with
rabbit antiserum that contained antibodies against the insecticidal
toxin. The antibodies bound only to the insecticidal toxin and not
to any nonspecific sites on the membrane.

5. The membranes were washed to remove unbound antibodies and
then treated with ¹²⁵I-labeled Staphylococcus aureus protein A, which
bound only to the Fc portion of the bound antibodies and not to
any nonspecific sites on the membrane.

6. Spots on the membrane corresponding to colonies that actively
synthesized the insecticidal toxin were visualized by autoradiography.

The isolated protoxin gene was then used as a DNA hybridization
probe to localize the cry gene to the 71-kb plasmid of B. thuringiensis subsp.
kurstaki. Similar cloning and screening procedures have been used to
isolate other B. thuringiensis toxin genes. However, given the current
knowledge regarding sequence similarity among B. thuringiensis protoxin genes,
the cloning and screening of these genes are more easily achieved by using
polymerase chain reaction (PCR) and DNA hybridization techniques.
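
The reasoning used above to discard the 2.0-kb plasmid fraction can be checked with a rough estimate of the minimum coding length for a protein of a given mass. The sketch below assumes an average mass of about 110 Da per amino acid residue, a common rule of thumb that is not taken from this chapter, and it ignores promoters, replication functions, and other sequences that a real plasmid must also carry. All names are illustrative.

```python
import math

AVERAGE_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb mean residue mass

# Plasmid sizes (kb) reported for B. thuringiensis subsp. kurstaki in the text.
PLASMID_SIZES_KB = (2.0, 7.4, 7.8, 8.2, 14.4, 45.0, 71.0)


def min_coding_length_bp(protein_kda: float) -> int:
    """Approximate minimum open reading frame length, including a stop codon."""
    residues = math.ceil(protein_kda * 1000.0 / AVERAGE_RESIDUE_MASS_DA)
    return residues * 3 + 3


def could_encode(plasmid_kb: float, protein_kda: float) -> bool:
    """True if the plasmid is at least as long as the minimum coding length."""
    return plasmid_kb * 1000.0 >= min_coding_length_bp(protein_kda)


if __name__ == "__main__":
    needed = min_coding_length_bp(130.0)
    print(f"130-kDa protoxin needs at least about {needed} bp of coding DNA")
    for size in PLASMID_SIZES_KB:
        verdict = "possible" if could_encode(size, 130.0) else "too small"
        print(f"{size:5.1f}-kb plasmid: {verdict}")
```

Only the 2.0-kb plasmid fails this test, which matches the decision to carry the medium-sized and large fractions forward.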


Engineering of B. thuringiensis Toxin Genes

Once the isolation and sequencing of a toxin gene were accomplished, the
complete amino acid sequence was deduced from the nucleotide sequence.
Comparisons with amino acid sequences from other B. thuringiensis toxin
proteins showed that a common toxic domain exists in these proteins.
Moreover, a subcloned segment of the complete protein-coding sequence
produced a truncated protein that retained full insecticidal activity. Thus,
an intact protoxin gene, a portion of one, or a chemically synthesized coding
sequence can be used for further genetic manipulation.

Synthesis during Vegetative Growth 

Under normal conditions, most B. thuringiensis protoxin proteins are
synthesized only during the sporulation phase of growth. In other words, only
a portion of the growth cycle of the organism is devoted to parasporal
crystal production. It might therefore be advantageous, in terms of
increased yield and decreased production time, to have the toxin gene
transcribed and translated during vegetative growth. Furthermore, production
of the insecticidal toxin during vegetative growth would permit the toxin
to be synthesized by a continuous fermentation process, potentially
significantly decreasing the cost of producing it. Continuous fermentations
are carried out with smaller-scale, and therefore less expensive,
bioreactors and downstream processing equipment than conventional batch
fermentations. (See chapter 17 for additional details.)

During the sporulation of B. thuringiensis, a specific transcription
initiation factor (sigma factor) interacts with the promoters of genes that are
active only within this phase of the bacterial life cycle. This factor turns on
the transcription of the messenger RNAs (mRNAs) that are unique to
sporulation. In fact, when a B. thuringiensis toxin gene with its
sporulation-specific promoter was cloned and expressed in Bacillus subtilis, Bacillus
megaterium, or B. thuringiensis, gene transcription occurred only during
sporulation. Thus, to express a B. thuringiensis insecticidal toxin during
vegetative growth, it is necessary to place the toxin-producing gene(s)
under the control of a promoter that is active during vegetative growth.

FIGURE 16.5 Procedure for subcloning the B. thuringiensis subsp. kurstaki insecticidal
toxin gene so that it is expressed constitutively under the control of the promoter of
the tetracycline resistance (Tet^r) gene (p_tet). The isolated B. thuringiensis toxin gene is
removed from its promoter by digestion of the isolated DNA fragment with
restriction enzymes RE1 and RE2. It is spliced by T4 DNA ligase into the plasmid vector
downstream from p_tet in place of the tetracycline resistance gene, which has been
removed by digestion with restriction enzymes RE1 and RE2.

In one experiment, a DNA fragment containing a toxin gene that lacked its
native promoter was cloned into a plasmid under the control of a continuously
active (constitutive) promoter from a tetracycline resistance gene that had
originally been isolated from a Bacillus cereus plasmid. When this construct
was introduced into B. thuringiensis, active toxin protein was produced
continuously throughout the growth cycle, including both the vegetative and
sporulation phases (Fig. 16.5). In addition, when the construct was used to
transform a sporulation-defective mutant of B. thuringiensis, toxin synthesis
occurred in the absence of sporulation. Under these conditions, toxin
synthesis is more efficient than in wild-type cells, i.e., the final yield of protein
is greater in the transformed cells, and less time and substrate are required
to produce the toxin. A refinement of this system might entail integration
of this vegetatively expressed toxin gene into the chromosomal DNA of the
sporulation-defective B. thuringiensis host. This manipulation would ensure
that the insecticidal toxin gene is not lost because of plasmid instability
during a continuous fermentation process.


FIGURE 16.6 Construction of a strain of B. thuringiensis with greater potency and UV
resistance. The C-terminal third of the cry1Ab gene was spliced together with the
N-terminal two-thirds of the cry1C gene, all under the control of the cry3Aa
promoter (p_cry3Aa), and then integrated into the chromosomal DNA of a
sporulation-minus strain of B. thuringiensis. Adapted from Sanchis et al., Appl. Environ.
Microbiol. 65:4032-4039, 1999.

Unlike that of most other B. thuringiensis toxin protein (cry) genes, the
expression of cry3A is normally controlled by a vegetative promoter, rather
than by a sporulation-specific promoter. The cry3A gene encodes a toxin
that is directed against coleopteran larvae. When a mutant strain of
B. thuringiensis that was unable to form spores was transformed with a plasmid
carrying a cloned cry3A gene, the insecticidal toxin was both overproduced
and stabilized in comparison to when this protein was produced in the
wild-type strain. This result suggests that other cry genes that are normally
expressed only during sporulation could be placed under the control of the
cry3A promoter and overproduced by expressing these constructs in a
sporulation-defective B. thuringiensis mutant.

In one experiment, a chimeric cry1C-cry1Ab gene was constructed,
placed under the transcriptional control of the vegetative cry3A promoter,
and then integrated into the chromosomal DNA of a nonsporulating
derivative of B. thuringiensis subsp. kurstaki (Fig. 16.6). The chimeric cry1C-cry1Ab
gene consisted of approximately 2.2 kb of DNA from the cry1C gene and 1.3
kb of DNA from the cry1Ab gene. Although the mature toxin that is
produced following proteolytic cleavage of the hybrid protoxin is identical to
the toxin that is produced from the cry1C gene, the hybrid was found to be
considerably more active than Cry1C (Table 16.3). Depending upon
the insect tested, Cry1C-Cry1Ab was approximately 4 to 38 times more active
than Cry1C. This seemingly strange result probably occurs because of the
increased stability to proteolytic digestion conferred by the Cry1Ab portion
of the hybrid protoxin protein, which is removed upon activation of the
protoxin. The nonsporulating B. thuringiensis host strain had a disrupted sigK
gene, which encodes the sigma factor σ²⁸, which is required for
sporulation-specific transcription. Other workers have created nonsporulating
B. thuringiensis strains by inserting modified protoxin genes into the late-stage
sporulation gene spoVBt1. Since the chimeric protoxin Cry1C-Cry1Ab was
encapsulated within the bacterial cells, the protein was considerably more
resistant to the degradative effect of ultraviolet (UV) light, which rapidly
inactivates the protoxin that is normally released from the bacterial cell during
sporulation. In addition to increased potency and greater UV resistance, the
environmental persistence of the nonsporulating mutant was significantly
decreased compared with that of the sporulating wild-type strain. This may
actually be an advantage, since it is less likely that the nonsporulating mutant
will transfer any of its DNA to other organisms in the environment.
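
The relative potency of the chimeric toxin follows directly from the LC50 values in Table 16.3: because a smaller LC50 means a more potent toxin, the fold increase in activity is the LC50 of Cry1C divided by the LC50 of the chimera. The short sketch below performs that division for each insect; the dictionary and function names are illustrative only.

```python
# LC50 values (ng) from Table 16.3: (Cry1C, Cry1C-Cry1Ab chimera).
LC50_NG = {
    "Spodoptera littoralis": (378.0, 103.0),
    "Plutella xylostella": (174.0, 4.6),
    "Ostrinia nubilalis": (3200.0, 822.0),
}


def fold_increase(lc50_parent: float, lc50_chimera: float) -> float:
    """Fold increase in potency; a lower LC50 means a more potent toxin."""
    if lc50_parent <= 0 or lc50_chimera <= 0:
        raise ValueError("LC50 values must be positive")
    return lc50_parent / lc50_chimera


def fold_increases(table: dict) -> dict:
    """Fold increase in potency for every insect species in the table."""
    return {species: fold_increase(*values) for species, values in table.items()}


if __name__ == "__main__":
    for species, fold in fold_increases(LC50_NG).items():
        print(f"{species}: {fold:.1f}-fold more active")
```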

Broadening the Spectrum of Target Insects 

Because many crops are attacked by more than one insect species, it would
be advantageous, if feasible, to create microbial insecticides that are effective
against a broad spectrum of target insects. Such broad specificity
could be obtained (1) by transferring the gene for a particular toxin, e.g., one
against diptera, into a B. thuringiensis strain that normally synthesizes a
different species-specific toxin, e.g., one against coleoptera; (2) by fusing
portions of two different species-specific toxin genes to one another so that a
unique dual-acting toxin (hybrid toxin) is produced; or (3) by modifying the
portion of the insecticidal toxin that is responsible for binding to insect gut
epithelial cell receptors.

TABLE 16.3 Activities of Cry1C and a chimeric Cry1C-Cry1Ab toxin against three
insect species

Insect species           LC50 of Cry1C (ng)    LC50 of Cry1C-Cry1Ab (ng)

Spodoptera littoralis    378                   103
Plutella xylostella      174                   4.6
Ostrinia nubilalis       3,200                 822

The LC50 values reflect the amount of insecticidal toxin (in nanograms) required to kill half of the insect
population being tested under a defined set of conditions. The smaller the LC50, the more potent the
toxin.

FIGURE 16.7 Naturally occurring and transformed subspecies of B. thuringiensis. The
oval shape represents a bacterium, while the circle represents an
insecticide-encoding plasmid. The plasmids are colored the same as the bacterium in which the
toxin gene originated.
(Figure labels: kurstaki, israelensis, aizawai, tenebrionis.)

Transferring cry genes. To test whether the spectrum of target insect pests
could be widened, the insecticidal toxin genes from B. thuringiensis subsp.
aizawai and tenebrionis were cloned into shuttle vectors that could be
maintained in both B. thuringiensis and E. coli. These genetic constructs were
then introduced by electroporation into B. thuringiensis subsp. kurstaki,
israelensis, and tenebrionis (Fig. 16.7), and all the transformed strains were
tested for toxicity to three different insect species.

In each case, the toxicity of the native host toxin protein(s) was
maintained, and in most cases, the introduced toxin gene also expressed an active
toxin with the same specificity as the toxin produced by the source
bacterium (Table 16.4). In addition, and surprisingly, when the B. thuringiensis
subsp. tenebrionis toxin gene was introduced into B. thuringiensis subsp.
israelensis, the resultant transformant was somewhat toxic to Pieris brassicae,
the cabbage white butterfly, against which neither of the gene products
alone has insecticidal activity.

In many instances, introduced plasmid vectors carrying isolated cry
genes are unstable in B. thuringiensis. Often, in the absence of selective
pressure, all or a portion of these plasmids are lost. The problem of plasmid
instability with introduced genes was overcome by integrating cloned cry
genes into the chromosomal DNA of the host cell. One group of researchers
attempted to broaden the insect specificity of a strain of B. thuringiensis
subsp. kurstaki, which normally carries five different insecticidal toxin
genes, cry1Aa, cry1Ab, cry1Ac, cry2Aa, and cry2Ab. While the products of
these cry genes are active against a variety of lepidopteran species, they are
not effective against Spodoptera spp. Therefore, a cry1Ca gene, which is
normally found only in B. thuringiensis subsp. aizawai and entomocidus, was
introduced into the chromosomal DNA of the B. thuringiensis subsp.
kurstaki host strain. The transformed B. thuringiensis subsp. kurstaki strain
showed a sixfold increase in its ability to kill Spodoptera exigua (beet
armyworm) larvae.

Modifying the loop regions of domain II. The toxic moiety of many Cry
proteins is composed of three separate domains. Domain II is involved in
the specific binding of the toxin to protein receptors that are found on the
surfaces of insect midgut epithelial cells, although domain III may also
play a role in receptor binding. Following binding, a portion of domain I,
in the N-terminal region of the toxin, inserts into the membrane. It is
believed that portions of domain I from several toxin molecules interact
to make up the pore. Domain III, which is located at the
C-terminal end of the toxin molecule, is also thought to be involved in pore
function.

TABLE 16.4 Toxicities of naturally occurring and transformed subspecies of
B. thuringiensis against the insects Pieris brassicae (cabbage white butterfly),
Aedes aegypti (mosquito), and Phaedon cochleariae (beetle)

Source of toxin                      Toxicity to:
Host DNA       Introduced DNA        Pieris    Aedes    Phaedon

aizawai        None                  ++        +        -
israelensis    None                  -         ++       -
israelensis    aizawai               ++        ++       -
israelensis    tenebrionis           +         ++       ++
kurstaki       None                  ++        +        -
kurstaki       tenebrionis           ++        +        ++
tenebrionis    None                  -         -        ++
tenebrionis    aizawai               ++        +        +

Adapted from Crickmore et al., Biochem. J. 270:133-136, 1990.

In these experiments, the toxicity was graded as follows: ++, 0 to 5% of the leaf was consumed (for
Phaedon and Pieris) or 100% mortality occurred within 1 hour (Aedes); +, 5 to 50% of the leaf was consumed
(Phaedon and Pieris) or 50 to 100% mortality occurred within 24 hours (Aedes); -, >50% of the leaf was
consumed (Phaedon and Pieris) or no mortality occurred within 24 hours (Aedes). The test plant was either
cabbage leaf (for Pieris) or turnip leaf (for Phaedon).

Modification of cry genes to increase the binding of the Cry protein to
receptors generally leads to an increase in insecticidal activity. In particular,
modification of domain II is an effective means of increasing Cry toxicity to
particular insects. In one series of experiments, researchers modified the
insect specificity of Cry19Aa. This was done by directed mutagenesis of the
cry19Aa gene, replacing a nucleotide sequence that encoded the amino
acids Ser-Tyr-Trp-Thr in loop 1 of domain II with a sequence encoding
Tyr-Gln-Asp-Leu-Arg and deleting a sequence in loop 2 encoding
Tyr-Pro-Trp-Gly-Asp (Fig. 16.8). The decisions regarding which sequences to
alter were based on computer models comparing the three-dimensional
structure of Cry19Aa with the structure of Cry4Ba. These changes
(alterations of both loop 1 and loop 2 were required) yielded a modified Cry19Aa
protein whose insecticidal activity against the mosquito Aedes aegypti was
increased more than 42,000-fold while its activity against other insects was
essentially unchanged. This work suggests that it may be possible to
rationally engineer various Cry toxins to have desired activities by
manipulating specific amino acid sequences within the protein loops. However,
even if the genetic manipulations are successful and designer-engineered
Cry proteins are attainable, it remains to be seen whether the general public
and the regulatory authorities in various countries will embrace this
technology, which would include releasing genetically manipulated bacteria
into the environment.

FIGURE 16.8 Schematic representation of the three-dimensional structure of the
Cry19Aa protein, highlighting loops 1 and 2, which were modified to alter the
insecticide's specificity. Adapted from Abdullah and Dean, Appl. Environ. Microbiol.
70:3769-3771, 2004, with permission.
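
The two loop changes described for Cry19Aa (a substitution in loop 1 and a deletion in loop 2) can be expressed at the level of the amino acid sequence with a few lines of code. The sketch below is illustrative only: the input sequence is a short, made-up stand-in rather than the real Cry19Aa sequence, the motifs are written in one-letter code (Ser-Tyr-Trp-Thr = SYWT, Tyr-Gln-Asp-Leu-Arg = YQDLR, Tyr-Pro-Trp-Gly-Asp = YPWGD), and the function name is hypothetical. In the laboratory the changes were made in the gene by directed mutagenesis, not in the protein.

```python
LOOP1_ORIGINAL = "SYWT"      # Ser-Tyr-Trp-Thr, loop 1 of domain II
LOOP1_REPLACEMENT = "YQDLR"  # Tyr-Gln-Asp-Leu-Arg
LOOP2_DELETED = "YPWGD"      # Tyr-Pro-Trp-Gly-Asp, loop 2 of domain II

# Hypothetical toy sequence containing each motif exactly once; it is NOT
# the real Cry19Aa sequence.
TOY_SEQUENCE = "MKAASYWTGGLNVYPWGDTTE"


def modify_loops(protein: str) -> str:
    """Replace the loop 1 motif and delete the loop 2 motif.

    Each motif must occur exactly once so that the edit is unambiguous.
    """
    for motif in (LOOP1_ORIGINAL, LOOP2_DELETED):
        if protein.count(motif) != 1:
            raise ValueError(f"expected exactly one occurrence of {motif}")
    edited = protein.replace(LOOP1_ORIGINAL, LOOP1_REPLACEMENT)
    return edited.replace(LOOP2_DELETED, "")


if __name__ == "__main__":
    print(TOY_SEQUENCE)
    print(modify_loops(TOY_SEQUENCE))
```

The net effect on length is a loss of four residues: loop 1 gains one residue and loop 2 loses five.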

Improving Delivery of a Mosquitocidal Toxin 

The B. thuringiensis subsp. israelensis insecticidal protein is highly toxic
when ingested by mosquito larvae. Since 1982, it has been used successfully
worldwide to control mosquitoes and blackflies. However, the parasporal
crystal of this subspecies sinks rapidly after it is sprayed on water, which
effectively removes it from the feeding area of mosquito larvae and dramatically
decreases its efficacy as a mosquitocide. To overcome this shortcoming,
several approaches have been attempted. Currently, B. thuringiensis subsp.
israelensis insecticidal protein is available as granules or as slow-release
rings or briquettes, which float on the surface of water. Another solution is
to introduce the insecticidal toxin gene into organisms that are common
food sources for mosquito larvae. Good candidate organisms for this
purpose include Synechocystis and Synechococcus spp., which are photosynthetic
cyanobacteria that proliferate near the water surface, where there is
sufficient light for their growth and where mosquito larvae are normally found.
Another organism with the potential to be a host for the expression of
foreign insecticidal toxin genes is Caulobacter crescentus, a bacterium
that is widely distributed throughout aquatic environments where
mosquito larvae feed. The toxin gene from B. thuringiensis subsp. israelensis
was introduced into and expressed in these organisms. In laboratory trials,
the insecticidal toxin that was produced by either transformed
cyanobacteria or C. crescentus was toxic to mosquito larvae. However, in field trials,
transformed cyanobacteria or C. crescentus expressing B. thuringiensis
insecticidal toxin genes had poor viability, and the cloned genes were expressed
at a low level.

A possible alternative host for the expression of mosquitocidal cry
genes is Asticcacaulis excentricus, a gram-negative aerobic bacterium that is
found in aqueous environments near the surface of the water. In a series of
experiments, A. excentricus was transformed with a broad-host-range
plasmid vector that carried the genes for mosquitocidal toxin proteins
produced by a strain of Bacillus sphaericus (a bacterium similar to B.
thuringiensis) under the control of the tacI promoter, which is a variant of
the tac promoter. This transformant produced insecticidal toxin proteins of 51
and 42 kDa and was almost as toxic to Anopheles and Culex mosquito larvae as
the naturally occurring high-toxicity strains of B. sphaericus. However,
unlike B. sphaericus, A. excentricus does not sink when it is sprayed onto



MILESTONE 


Cloning and Expression of the Bacillus thuringiensis 
Crystal Protein Gene in Escherichia coli 

H. E. Schnepf and H. R. Whiteley 
Proc. Natl. Acad. Sci. USA 78:2893-2897, 1981 


Although it had been well known for a long time that the parasporal crystal
that is produced by B. thuringiensis contained insecticidal activity, it took
scientists many years before the conditions for solubilizing the crystal were
discovered and the insecticidal protein toxin was isolated in a pure form.
Moreover, although some B. thuringiensis mutants that were defective in the
synthesis of the parasporal crystal were known, protocols for the genetic
transformation of B. thuringiensis were not well developed. Therefore, when
Schnepf and Whiteley decided to isolate B. thuringiensis insecticidal toxin
protein genes, they were limited to screening E. coli transformants carrying
B. thuringiensis DNA either immunologically, using antibodies directed against
the whole crystal, or by the insecticidal activity of extracts of the
transformants. Since evidence at the time suggested that the insecticidal
toxin was probably plasmid encoded, clone banks were constructed from
fractionated plasmid preparations with the idea of significantly enriching the
clone banks for the presence of insecticidal toxin genes. Moreover, despite
some concerns that antibodies against the whole (B. thuringiensis-produced and
glycosylated) crystal protein might not interact with (E. coli-produced and
nonglycosylated) crystal protein subunits in solution, this did not turn out
to be a problem. Finally, extracts of the particulate fraction of E. coli
transformants carrying the B. thuringiensis insecticidal toxin gene were found
to be toxic to susceptible insects. This first cloning of a B. thuringiensis
insecticidal toxin gene made it clear to workers in this field that these
genes could be isolated in a straightforward manner and provided an impetus
for increased activity both in the search for new strains of B. thuringiensis
and in studies of the biochemistry of the insecticidal toxin.
ponds infested with mosquito larvae. Moreover, A. excentricus is inexpensive
to produce, as it can be grown on much simpler media than either B.
sphaericus or B. thuringiensis. It does not have a high level of protease
activity, so the insecticidal toxin is not readily degraded. It is well adapted
to environments such as those near the surface of standing water that are
exposed to relatively high levels of UV light. Thus, A. excentricus cells
should not be as sensitive to inactivation by UV light as those of either B.
sphaericus or B. thuringiensis. However, to use a genetically engineered
strain of A. excentricus to control mosquito populations in the environment,
it will be necessary to integrate the insecticidal toxin genes into the
chromosomal DNA without any antibiotic resistance genes.

Protecting Plant Roots 

Insects that attack the roots of plants are not affected by B.
thuringiensis-based insecticides that are sprayed onto leaves and shoots.
However, it is possible to introduce the toxin gene from a B. thuringiensis
strain into a bacterial species that colonizes the region adjacent to plant
roots (the rhizosphere). The engineered bacteria could be introduced into the
soil, where they would synthesize the insecticidal toxin and release it into
the area immediately surrounding the plant roots, thereby conferring protection
against root-attacking insects. In addition, as long as the engineered
bacteria were able to persist in the soil, they would continue to synthesize
the insecticidal toxin, thus obviating the need for repeated spraying of either
biological or chemical insecticides. This approach has been tested on a
small scale. The gene for the B. thuringiensis subsp. kurstaki insecticidal
toxin was integrated into the chromosomal DNA of a strain of P. fluorescens
that colonizes corn (maize) roots. The integration of the toxin gene was
achieved as follows (Fig. 16.9).

1. A transposon Tn5 element that had been cloned into a plasmid was
genetically modified by altering portions of its left and right
borders and deleting its transposase gene. Such an altered Tn5
element cannot be excised from the plasmid, even by exogenous
transposase, because the left and right borders are not
recognized by the transposase.

2. An isolated B. thuringiensis subsp. kurstaki insecticidal toxin gene 
was spliced into the middle of the altered Tn5 element on the 
plasmid and placed under the control of a constitutive promoter. 

3. A wild-type Tn5 element was transposed into the chromosome of 
the root-colonizing strain of P. fluorescens. 

4. The plasmid carrying the altered Tn5 element with the inserted
toxin gene was introduced into P. fluorescens carrying the
integrated wild-type Tn5 element.

5. Homologous recombination by means of a double crossover
between the nontransposable Tn5 element on the plasmid that
carried the toxin gene and the chromosomally integrated wild-type
Tn5 led to the integration of the altered Tn5 with the toxin gene
into the chromosomal DNA, with the concomitant loss of the
wild-type Tn5 element.

In this form, the toxin gene is unlikely to be lost either during
large-scale laboratory growth or after release of the engineered microorganism
into the environment. Also, the probability of transfer of the toxin gene to
FIGURE 16.9 Procedure for the development of a genetically engineered P. fluorescens
strain that carries a copy of the B. thuringiensis insecticidal toxin gene integrated 
into its chromosomal DNA. The B. thuringiensis insecticidal toxin gene is cloned into 
an excision-defective variant of Tn5 on a plasmid. This construct is introduced into 
a P. fluorescens strain containing a wild-type Tn5 sequence that has been integrated 
into its chromosomal DNA. By homologous recombination, the excision-defective 
Tn5 element carrying the B. thuringiensis insecticidal toxin gene becomes integrated 
into the P. fluorescens chromosome. 


other microorganisms in the environment is very low. Laboratory trials
showed that the engineered P. fluorescens was toxic to tobacco hornworm
larvae. However, the ability of this genetically manipulated microorganism
to minimize root damage from insect predation remains to be tested in the
greenhouse and in open-field trials.

In other laboratories, various B. thuringiensis insecticidal toxin genes
have been introduced into the chromosomal DNA of a number of different
microorganisms. For example, the cry1Ac gene was introduced
into a strain of P. fluorescens and found to protect sugarcane plants against
the sugarcane borer, Eldana saccharina. Also, when this gene was used to
transform Clavibacter xyli subsp. cynodontis, a bacterium that normally
inhabits the xylem of Bermuda grass, the bacterium protected corn plants
from damage caused by the European corn borer, Ostrinia nubilalis.

Protoxin Processing 

Occasionally, during the proteolytic processing of the protoxin to the active
toxin, the insect midgut proteases continue to cleave the toxin protein and
thereby render it inactive (Fig. 16.10). This degradation process occurred
when the 63-kDa Cry2Aa1 protoxin was treated with the midgut juices of
the gypsy moth (Lymantria dispar). The midgut juices, which contained the
protoxin-processing protease, first cleaved the protoxin on the C-terminal
side of Tyr49, producing the active 58-kDa Cry2Aa1 toxin. However,
continued incubation of the toxin with the midgut enzymes resulted in the
cleavage of the toxin on the C-terminal side of Leu144. This second cleavage
produced an inactive 49-kDa protein, dramatically reducing the effectiveness
of the toxin. To ascertain that this result was not an artifact,
researchers radiolabeled the Cry2Aa1 protoxin and showed that this excessive
cleavage also occurred in vivo. To try to prevent the production of the
inactive form of the toxin, the protoxin gene was altered in five different
ways by site-directed mutagenesis. The amino acid residue in position 144
of the protoxin was changed from leucine to aspartic acid, alanine, glycine,
histidine, or valine. All of the mutant proteins yielded a higher level of
active toxin than the native form, and with the exception of the
leucine-to-histidine change, the active mutant toxin proteins were no longer
cleaved to an inactive form.
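
As a rough arithmetic check on the fragment sizes quoted above, the following
sketch estimates the mass that remains after each cleavage. It assumes that
each cut removes the N-terminal residues up to the cut site and that an average
amino acid residue weighs about 110 Da; both are illustrative assumptions, not
details taken from the study, and the function name is hypothetical.

```python
AVG_RESIDUE_KDA = 0.110  # assumed average residue mass (rule of thumb), in kDa


def remaining_mass_kda(full_mass_kda, cut_after_residue):
    """Estimate the mass left after removing residues 1..cut_after_residue."""
    return full_mass_kda - cut_after_residue * AVG_RESIDUE_KDA


protoxin_kda = 63.0  # size of the Cry2Aa1 protoxin given in the text
active = remaining_mass_kda(protoxin_kda, 49)     # cleavage after Tyr49
inactive = remaining_mass_kda(protoxin_kda, 144)  # cleavage after Leu144
print(f"after Tyr49:  ~{active:.1f} kDa (text: 58 kDa)")
print(f"after Leu144: ~{inactive:.1f} kDa (text: 49 kDa)")
```

The estimates land within a few kilodaltons of the reported 58-kDa and 49-kDa
species, which is about as close as a per-residue average can be expected to
come.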

Since the C-terminal half of many B. thuringiensis insecticidal protoxins
is not toxic to insects, it would be advantageous if that half of the protein
could be eliminated. Then, the cellular resources that had previously gone
into synthesizing the C-terminal half of the protoxin might be used to
synthesize more of the active toxin, thereby increasing the amount of toxin
that a bacterium might produce. Unfortunately, when such truncated cry1 genes


FIGURE 16.10 Activation and subsequent cleavage of B. thuringiensis Cry2Aa1
protoxin by L. dispar midgut enzymes. Activation occurs by cleavage of the
protoxin on the C-terminal side of Tyr49. Toxin inactivation occurs when the
protein is cleaved on the C-terminal side of Leu144. When Leu144 is changed to
one of several different amino acids, the toxin is both active and resistant to
further proteolytic cleavage.
FIGURE 16.11 A genetic construct that produces a high level of stable truncated Cry1C
protein. This construct includes the promoter from a cyt1A gene (p_cyt1A), a
ribosome-binding site (RBS) from cry3A that stabilizes mRNA, a cry3A gene
transcription terminator region (TT), a gene from the cry11A operon encoding a
20-kDa chaperone-like protein that stabilizes the structure of the Cry1C
protein, and a gene from the cry2A operon encoding a 29-kDa protein that
facilitates protoxin crystal formation. Each of the last two genes has its own
promoter, ribosome-binding site, and transcription termination site (not shown).


were expressed in B. thuringiensis, the toxin yields were low and crystals
did not form. To remedy this situation, several genetic elements that were
known to enhance the synthesis and crystallization of "naturally truncated"
Cry proteins were tested both separately and together in order to
improve the stability and yield of truncated Cry1C (Fig. 16.11). The
construct that significantly increased both the stability and yield of
truncated Cry1C protein, with the truncated protein now forming crystals within
sporulated cells, contained a number of different genetic elements,
including the gene for a 20-kDa chaperone-like protein and the gene for a
29-kDa protein that facilitated protoxin crystal formation.

Preventing the Development of Resistance 

When B. thuringiensis subsp. kurstaki is used as an insecticide in a controlled
environment where there is no sunlight to rapidly break down the protoxin,
e.g., when stored grain is treated to protect it against insect predation,
resistant target insects develop within a few generations. This
inherited resistance is typically due to an alteration in a midgut membrane
protein that normally acts as a receptor for the B. thuringiensis subsp.
kurstaki toxin. Resistant insects accumulate because the protoxin persists
under these conditions and selects for resistant individuals. The lesson here
is that the simplest way to avoid selecting for insects that are resistant to B.
thuringiensis subsp. kurstaki in the absence of sunlight is to limit the use of
this bacterium to field applications. However, extensive annual use, even
under natural conditions, may result in a level of persistence high enough
to allow selection to occur. Certainly, as larger quantities of B. thuringiensis
are used over a wider geographical area, the probability that resistant
strains of insects will be selected will increase. Various ways to avert this
problem are being examined. These strategies, which may be utilized either
with B. thuringiensis that is sprayed or with transgenic plants expressing
the insecticidal toxin, include the following.

1. The use of two or more B. thuringiensis insecticidal toxins at the 
same time. Provided that the toxins bind to different receptors, it is 
extremely unlikely that an insect will develop resistance to both 
toxins at the same time. When this approach is used in transgenic 
plants, it is often called "gene pyramiding." 

2. Application of a B. thuringiensis insecticidal toxin along with
traditional chemical insecticides. The idea here is that almost no
insect survives these two very different treatments, and resistance
does not develop to either. Transgenic plants that produce a
B. thuringiensis insecticidal toxin are commonly treated with
chemical insecticides. However, the number of chemical insecticide
treatments is significantly reduced when the plants produce
a B. thuringiensis insecticidal toxin. In Florida, nontransgenic corn
plants often require as many as 10 sprayings of chemical
insecticides per growing season. Plants that produce a B. thuringiensis
insecticidal toxin are more likely to be sprayed with chemical
insecticides only about three or four times a season.

3. Application of a B. thuringiensis insecticidal toxin at the same time 
as another biologically based insecticidal protein (typically isolated 
from plants; see chapter 18). Again, it is extremely unlikely that the 
target insects will survive both types of insecticides. 

4. The use of two B. thuringiensis insecticidal toxins, one of which has 
had its toxin gene modified so that it binds to a different receptor 
than the other toxin. 

5. The use of refugia (small tracts of land where the crop is not treated
with the microbial insecticide). Approximately 20% of a crop is not
sprayed with B. thuringiensis (or 20% is nontransgenic, with the
remaining 80% of the plants being transgenic and producing a B.
thuringiensis insecticidal toxin). The wild-type insects can
proliferate in the absence of the B. thuringiensis insecticidal toxin, and
only (a very small number of) mutant insects that are resistant to
the high levels of B. thuringiensis insecticidal toxin survive in the
presence of the toxin. Upon mating, the small number of resistant
insects will all mate with sensitive insects, so that the next
generation will contain mostly homozygous sensitive insects and a small
number of heterozygous sensitive insects. This strategy assumes
that resistance to the B. thuringiensis insecticidal toxin is inherited
as a recessive trait. This approach has been used in the field for a
number of years, with all of the available evidence indicating that
little to no resistance to any B. thuringiensis insecticidal toxins has
developed.
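
The reasoning behind the refuge strategy (item 5) can be illustrated with a
minimal population genetics sketch. It assumes a single recessive resistance
allele r, random mating among all survivors, complete kill of sensitive (SS and
Sr) insects on toxin-producing plants, and a starting allele frequency of 0.01.
These values and the function name are hypothetical and chosen only for
illustration.

```python
def next_generation(q, refuge_fraction):
    """One round of selection followed by random mating.

    q is the frequency of a recessive resistance allele r. Only rr insects
    survive on toxin-treated plants; all genotypes survive in the refuge.
    Returns (new allele frequency, fraction of resistant rr offspring).
    """
    p = 1.0 - q
    ss, sr, rr = p * p, 2 * p * q, q * q  # Hardy-Weinberg genotype frequencies
    surv_ss = refuge_fraction * ss        # sensitive insects survive only in the refuge
    surv_sr = refuge_fraction * sr        # heterozygotes are sensitive (recessive trait)
    surv_rr = rr                          # rr insects survive everywhere
    total = surv_ss + surv_sr + surv_rr
    q_next = (0.5 * surv_sr + surv_rr) / total
    return q_next, q_next ** 2


for refuge in (0.0, 0.2):
    q_next, rr_next = next_generation(q=0.01, refuge_fraction=refuge)
    print(f"refuge {refuge:.0%}: r allele {q_next:.4f}, rr offspring {rr_next:.6f}")
```

Under these assumptions, a field with no refuge leaves only resistant survivors,
so the resistance allele becomes fixed in a single generation, whereas a 20%
refuge keeps the allele close to its starting frequency because the few
resistant survivors mate mainly with sensitive insects.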

As noted above, fusion of the coding portions of the active regions of
two different toxin genes is another way of generating a novel protein with
extended toxicity. This idea has been examined in laboratory experiments.
When a series of lepidopteran-specific hybrid toxins were constructed,
some of them were more toxic than the products of either of the contributing
genes by themselves, and in one case, a hybrid protein had acquired
a totally new biological activity.

Generally, resistance to B. thuringiensis insecticidal toxins is the
consequence of a mutation(s) that alters an insect midgut receptor protein(s)
so that it no longer binds to the Cry protein. However, if a toxin gene were
engineered so that the toxin bound to more than one midgut cell surface
protein, then resistance might be less likely to arise, since it would require
alterations to several proteins.

The insecticidal proteins Cry1Ca and Cry1Ea are both toxic to lepidoptera
but have different species specificities. Cry1Ca is active against S.
exigua, Mamestra brassicae, and Manduca sexta, while Cry1Ea is active only
against M. sexta. In one experiment, hybrid Cry1Ca-Cry1Ea proteins were
constructed and tested for their toxicities to different insect species, as well
as for their abilities to bind to different receptors (Fig. 16.12). The hybrid
toxin G27, which contained domain III from Cry1Ca, was toxic to S. exigua
FIGURE 16.12 Toxicities and binding specificities of Cry1Ca, Cry1Ea, and hybrid
toxins G27 and F26. A toxin-receptor-binding assay was used to determine binding
specificities. Unlabeled Cry1Ca or Cry1Ea was added to a complex of the S. exigua
midgut receptor protein and radiolabeled toxin, and the extent of binding of the
radiolabeled toxin was determined. Adapted from Bosch et al., Bio/Technology
12:915-918, 1994.


larvae even though it bound to the Cry1Ea receptor but not to the Cry1Ca
receptor (Fig. 16.13). Conversely, the hybrid toxin F26 was not toxic to S.
exigua larvae even though it bound to the Cry1Ca receptor. Since the
Cry1Ca and G27 proteins bind to different insect midgut receptors
(although both are toxic to S. exigua), either simultaneous or alternating
treatments of S. exigua with these two B. thuringiensis insecticidal toxins
might limit the development of strains that are resistant to the toxins.
Resistance to both Cry1C and G27 would require mutations in two separate
midgut proteins.
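
The benefit of requiring two separate mutations can be shown with simple
arithmetic. The sketch below assumes that resistance to each toxin arises
independently and uses a purely hypothetical frequency of 1 in 1,000 for
resistance to a single toxin; neither figure comes from the experiments
described here.

```python
def double_resistance_frequency(freq_a, freq_b):
    """Frequency of insects resistant to both toxins, assuming independence."""
    return freq_a * freq_b


single = 1e-3  # hypothetical frequency of resistance to one toxin
both = double_resistance_frequency(single, single)
print(f"resistant to one toxin: {single:.0e}; resistant to both: {both:.0e}")
```

If the two receptor mutations were not independent (for example, if one
mutation affected both receptors), the combined frequency would be higher than
this product.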

B. thuringiensis subsp. israelensis thwarts insect resistance. In contrast to
what has been observed with other strains of B. thuringiensis, no instances
of field resistance of mosquitoes to B. thuringiensis subsp. israelensis have
ever been reported, and only low levels of resistance have been observed
in laboratory studies. This lack of insect resistance may reflect the fact that,
in addition to synthesizing at least three different Cry proteins (Cry4A,
Cry4B, and Cry11A), B. thuringiensis subsp. israelensis also produces
Cyt1A, a highly hydrophobic endotoxin that is not at all homologous to
any of the Cry proteins and appears to have a completely different mode
of action. While Cry proteins bind to glycoproteins on the insect midgut
epithelial membrane, the primary affinity of Cyt1A is the lipid component
of the membrane, especially the unsaturated fatty acids. Cyt1A acts
synergistically with the Cry proteins, and its presence may explain why
mosquitoes do not develop resistance to the Cry proteins. In one series of
experiments, using purified insecticidal proteins, it was demonstrated that
with the addition of the Cyt1A protein, insects that had become resistant
to Cry4A, Cry4B, and Cry11A (all of which are encoded by B. thuringiensis
subsp. israelensis) were killed when they were treated with B. thuringiensis
subsp. israelensis. Recent experiments suggest that, following the binding
of Cyt1A to the midgut epithelial membrane, the protein can act as a
receptor for some of the Cry proteins encoded by B. thuringiensis subsp.
FIGURE 16.13 Schematic representation of Cry1C binding to the Cry1C receptor and
hybrid G27 binding to the Cry1E receptor. For Cry1C, CI, CII, and CIII indicate that
these domains all originate from Cry1C. For G27, EII and EIII indicate that these
domains are from Cry1E.


israelensis. The reason that B. thuringiensis subsp. israelensis is a highly
effective insect pathogenic bacterium may be that the strain not only
carries several insecticidal proteins but also contains a protein that acts as
the receptor for these insecticidal proteins. It is therefore extremely
unlikely that any target insect will be able to develop resistance to B.
thuringiensis subsp. israelensis. To capitalize on the advantage that the Cyt1A
protein provides to B. thuringiensis subsp. israelensis, genes for Cyt1A and
Cry1Ac (which is typically found in B. thuringiensis subsp. kurstaki strains
and targets lepidopteran larvae) were used to transform a strain of B.
thuringiensis (Fig. 16.14). The combination of these two proteins in one strain
was highly toxic to the diamondback moth (Plutella xylostella), a
lepidopteran species. On the other hand, strains that expressed one or the
other of these proteins, but not both, required extremely high levels of the
proteins before any toxicity could be detected. These results suggest, in
this case, that Cyt1A (which usually targets diptera) is uncharacteristically
behaving as a receptor for Cry1Ac (a lepidopteran toxin). It will be exciting
to ascertain whether this synergism can be extended to other Cry proteins
combined with Cyt1A.

Other strategies that have been proposed as a means of avoiding the
development of insects that are resistant to B. thuringiensis insecticidal
toxins include alternating the strain of B. thuringiensis that is employed
from one season to the next, alternating B. thuringiensis treatment with the
use of chemical or other biological insecticides, or applying mixtures of
different strains of B. thuringiensis.

Improved Biocontrol 

Insects such as the sugarcane borer (E. saccharina) that attack the internal
regions of plants such as sugarcane are not affected by B.
thuringiensis-based insecticides that are sprayed onto leaves and shoots.
However, it is possible to introduce the toxin gene from a B. thuringiensis
strain into a bacterium that colonizes either plant roots or interior surfaces.
In these instances, the insecticidal toxin is delivered to the part of the
plant that is normally attacked by the insect.
FIGURE 16.14 Schematic representation of an engineered strain of B. thuringiensis
encoding both Cyt1A (which is from B. thuringiensis subsp. israelensis and becomes
the membrane-bound receptor) and Cry1A (the lepidopteran-specific insecticidal
toxin) proteins.


In one series of experiments, researchers genetically engineered two
different P. fluorescens strains that, when fed to E. saccharina larvae, acted
synergistically in limiting the proliferation of the insect. One P. fluorescens
strain was engineered to express the cry1Ac7 gene under the transcriptional
control of the tac promoter, with the entire construct integrated into the
host chromosomal DNA. The other P. fluorescens strain was engineered to


TABLE 16.5 Synergistic effects of P. fluorescens expressing Cry1Ac7 toxin and
P. fluorescens expressing chitinase on sugarcane borer larvae

Concentration (mg/g of diet)                          Insect mortality (%)
Toxin-producing strain   Chitinase-producing strain   Day 2     Day 5
0                        0                             5.5       7.6
0.3                      0                            12.5      33.8
3.0                      0                            30.8      42.7
0                        0.3                           8.2      20
0                        30.0                         21.8      42.7
0.3                      0.3                          39.3      55.7
0.3                      30.0                         37.5      68.3

Adapted from Downing et al., Appl. Environ. Microbiol. 66:2804-2810, 2000.
Day 2 and Day 5 indicate the number of days after the treatment was started.
FIGURE 16.15 Location of α-chymotrypsin recognition and cut site within domain I
of Cry3A and mCry3A. The letters represent the amino acid residues located
between α-helices 3 and 4.


express a chitinase gene, originally isolated from the bacterium Serratia
marcescens, also under the control of the tac promoter and integrated into
the host chromosomal DNA. The chitinase is believed to cause perforations
in the chitin-containing peritrophic membrane of the insect larvae, thereby
lysing the membrane and killing the larvae, as well as increasing the
accessibility of the midgut membranes to the B. thuringiensis insecticidal
toxin. In laboratory tests, each of these P. fluorescens strains separately was
toxic to E. saccharina larvae (Table 16.5). Moreover, when the two P.
fluorescens strains were used together, there was significant synergism between
the treatments so that low levels of both strains yielded a high level of insect
mortality. Ideally, a bacterial endophyte that can colonize the interior
surfaces of the plant, rather than a surface-colonizing bacterial strain, would
be preferred as a host strain, and of course, the efficacy of any construct
must be demonstrated in the field, as well as in the laboratory.
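
One way to see the synergism in the Table 16.5 data is to compare the observed
mortality for the combined treatment with the mortality expected if the two
strains acted independently. The sketch below uses Abbott's correction for
control mortality and the independent-action (Bliss) model. These are standard
analytical conventions that are assumed here for illustration; they are not
described in the source as the method used in the original study, and the
function names are hypothetical.

```python
def abbott_corrected(mortality_pct, control_pct):
    """Fraction killed by the treatment itself, corrected for control deaths."""
    return (mortality_pct - control_pct) / (100.0 - control_pct)


def expected_if_independent(frac_a, frac_b):
    """Combined kill fraction if the two agents acted independently."""
    return 1.0 - (1.0 - frac_a) * (1.0 - frac_b)


# Day 5 mortality (%) from Table 16.5
control = 7.6          # no bacteria added
toxin_only = 33.8      # 0.3 mg/g of the toxin-producing strain
chitinase_only = 20.0  # 0.3 mg/g of the chitinase-producing strain
combined = 55.7        # 0.3 mg/g of each strain

expected = expected_if_independent(
    abbott_corrected(toxin_only, control),
    abbott_corrected(chitinase_only, control),
)
observed = abbott_corrected(combined, control)
print(f"expected if independent: {expected:.2f}; observed: {observed:.2f}")
```

Under these assumptions, the observed corrected mortality (about 0.52) exceeds
the value expected from independent action (about 0.38), which is consistent
with the synergism described in the text.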

Other workers have reported that the expression of a chitinase gene in
a mosquitocidal strain of B. sphaericus (which is similar to B. thuringiensis
subsp. israelensis) yielded a recombinant strain that was ~4,300 times more
toxic than the wild type against a strain of the mosquito Culex
quinquefasciatus that is considered to be resistant to the wild-type strain.
The higher toxicity of the chitinase-expressing strain is thought to reflect
the fact that chitinase digestion facilitates the interaction between the
insecticidal toxin and its target cells.

While the Cry3A protein is an effective insecticide against the Colorado
potato beetle (Leptinotarsa decemlineata), it shows very little activity against
the western corn rootworm (Diabrotica undecimpunctata howardi). Researchers
speculated that the low level of activity against the western corn rootworm
might reflect the fact that in this insect, proteolytic cleavage of the
precursor form of the toxin is not sufficient for biological activity. That is,
processing of Cry3A by the protease chymotrypsin might be necessary to
increase the solubility and functional binding of the insecticide to the insect
brush border membrane. It was found that the normal Cry3A chymotrypsin
cleavage site (Fig. 16.15) was not efficiently cleaved in vitro. However,
when a new enzyme recognition site for chymotrypsin was introduced into
Cry3A (near the existing site), cleavage of the modified protein (mCry3A)
by chymotrypsin increased substantially, and the solubility and
insecticidal activity of the protein against the western corn rootworm, both
in vitro and in vivo, also increased. Before a strain of B. thuringiensis that
carries a gene for this modified Cry3A can be used in the environment, it
will be necessary to elaborate its complete insect specificity and to ascertain
(first in the laboratory) that this small modification in the structure of





TABLE 16.6 Some insect pests that are currently controlled with baculoviruses

Pest                     Common name                  Crop
Anticarsia gemmatalis    Velvetbean caterpillar       Soybean
Chrysomela scripta       Cottonwood leaf beetle       Trees
Cydia pomonella          Codling moth                 Apple, walnut
Heliothis sp.            Cotton bollworm              Cotton, sorghum
Lymantria dispar         Gypsy moth                   Deciduous trees
Mamestra brassicae       Cabbage moth                 Vegetables
Neodiprion sertifer      European pine sawfly         Pine
Oryctes rhinoceros       Rhinoceros beetle            Coconut
Spodoptera exigua        Beet armyworm                Vegetables, flowers
Spodoptera littoralis    Egyptian cotton leaf worm    Cotton
Trichoplusia ni          Cabbage looper               Brassicas


Cry3A has not inadvertently generated any toxic activities against humans 
or other animals. 

Activated Cry toxins bind to specific proteins (cadherins) on the surfaces
of the microvilli of the insect midgut epithelial cells. Binding of toxin
monomers to cadherins, which are transmembrane glycoproteins containing
12 cadherin repeating domains and one membrane-proximal extracellular
domain (Fig. 16.16), facilitates the development of a multimeric
form of the toxin monomers and formation of a pore in the membrane. Loss
of cadherin or mutation of cadherin genes is generally associated with
resistance to B. thuringiensis. A fragment of a cadherin protein containing
the 12 repeating units and the membrane-proximal extracellular domain
was mixed with Cry1A and fed to insect larvae. It was expected that the
cadherin protein fragment would block the binding of the Cry1A protein to
the midgut epithelial cells. Instead, its addition dramatically enhanced the
Cry1A-induced insect mortality. The cadherin peptide fragment may first
bind to microvilli and then attract Cry1A molecules, thereby increasing the
probability of the toxin interacting with the bona fide receptor. It is thought
that this approach, that is, the simultaneous application of Cry proteins and
a peptide containing a portion of the receptor protein, will overcome or
significantly delay the development of insect resistance by increasing Cry
protein insect toxicity.


Baculoviruses as Biocontrol Agents 

Mode of Action 

Baculoviruses are rod-shaped double-stranded DNA viruses that can infect
and kill a large number of different invertebrate organisms. Subgroups of
this viral family are pathogenic to several orders of insects, including the
Lepidoptera, Hymenoptera, Diptera, Neuroptera, Trichoptera, Coleoptera,
and Homoptera. In nature, some of these baculoviruses are important for
the control of certain pest insects, and several have been registered for use
as biological insecticides. Baculoviruses were used in North America against
forest pests, such as the spruce sawfly (Neodiprion sertifer), starting in the
1930s and ending with the advent of chemical pesticides in the 1960s.
Baculoviruses continue to be used on a limited basis (approximately 0.1%
of the money spent on pest control is directed toward baculoviruses),
mostly by the forestry industry in an effort to control the gypsy moth, L.
dispar (Table 16.6). It has been estimated that the costs associated with the
development, production, and use of baculoviruses in developing countries



FIGURE 16.16 Schematic representation of a cadherin protein molecule embedded
in a midgut epithelial membrane. Twelve cadherin domains are labeled. Domains
7, 11, and 12, which are highlighted, have been implicated as binding sites for
Cry proteins. The purple region represents the membrane-proximal extracellular
domain; the transmembrane domain is shown in light blue; and the cytoplasmic
domain is shown in red. The arrow indicates the junction between the
membrane-proximal extracellular domain and the transmembrane domain. The
cadherin peptide analogue includes the 12 cadherin domains and the
membrane-proximal extracellular domain.
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are much less than in industrialized countries. This is due to lower material 
and labor costs in developing countries, as well as to the smaller size of 
farms in some countries, lower levels of agricultural mechanization, and 
less expensive registration procedures. Nevertheless, even in the more 
developed countries of the world, the costs of the development and registra¬ 
tion of a naturally occurring baculovirus are much less than the costs for a 
chemical insecticide. However, because baculoviruses have a high degree of 
host specificity, it is necessary to develop a large number of different bacu¬ 
loviruses to deal with different insect pests, whereas a single chemical pes¬ 
ticide might control all of them. 

The vast majority of baculoviruses used as biological control agents are 
members of the genus Nucleopolyhedrovirus, and all subsequent discussion 
of baculoviruses refers to viruses from this genus. A baculovirus particle 
consists of a cylindrical nucleocapsid that surrounds the viral DNA. Often, 
in the nucleus of an infected cell, baculovirus particles are embedded in a 
crystalline protein matrix called an occlusion body. The occlusion body, or 
polyhedron, is largely composed of the protein polyhedrin. When an 
infected insect dies, millions of polyhedra are released. Upon ingestion by 
an insect, the polyhedra move to the midgut, where the alkaline environ¬ 
ment facilitates the dissolution of the polyhedrin protein coat, releasing 
infectious nucleocapsids. The nucleocapsids are taken up by the insect 
midgut cells and then migrate through the cytoplasm to the nucleus, where 
the nucleocapsid is removed. After viral replication, which takes place in 
the nucleus, and nucleocapsid assembly, some nucleocapsids are released 
by budding through the plasma membranes of infected cells into the circu¬ 
latory system of the insect. Consequently, the infection spreads to other 
cells throughout the insect. It usually takes about 10 rounds of viral replica¬ 
tion, or about 5 to 9 days, for the insect to die. At that stage, about 25% of 
the dry weight of the insect consists of polyhedra. 

A positive feature of using baculoviruses as biocontrol agents is that 
they generally have limited host ranges and do not affect nontarget organ¬ 
isms. However, this means that any particular baculovirus can be used to 
control only a limited number of insect pests. Since baculoviruses coevolved 
with their insect hosts over thousands of years, they are well adapted to 
avoid the insect's defense mechanisms, and resistance to these viruses 
develops only rarely, and much less frequently than resistance to B. 
thuringiensis. 

Control of the European spruce sawfly (Gilpinia hercyniae) population 
in eastern Canada is the best example of insect control by a baculovirus. 
European sawfly populations were reduced to below economic threshold 
levels by 1943 and remain under control today. Ironically, the reason why 
some baculoviruses are not used commercially is related to the effective¬ 
ness of the virus. If a virus is effective at preventing proliferation of a par¬ 
ticular insect species, the virus has to be applied only once every year or so, 
making it difficult for the industry to justify the high registration costs. 
Farmers and growers prefer to use a single insecticidal agent that can con¬ 
trol many different insect pests rather than a number of different insecti¬ 
cides, so if baculoviruses are to be used more extensively, their limited host 
range needs to be expanded. 

It has been known for some time that when insect cells are infected 
with two different strains of baculovirus at the same time, new variant 
viruses with slightly different specificities can form after the two starting 
viruses have replicated. These new viruses are the product of homologous 
recombination between the two starting viruses. Detailed analysis of this 
phenomenon revealed that a region of DNA that is only 79 base pairs (bp) 
long and located within the p143 helicase gene was sufficient to permit 
homologous recombination between different baculoviruses. More 
importantly, this 79-bp DNA segment may be responsible for the host ranges of 
different baculoviruses. Therefore, alteration of some of the nucleotides 
within this 79-bp DNA segment may allow researchers to generate 
baculoviruses with modified (expanded) insect specificities. 

Genetic Engineering for Improved Biocontrol 

Baculoviruses are relatively slow in killing target insects. Depending on 
conditions, it can take from a few days to several weeks before the viral 
infection leads to the host's death. To remedy this ineffectiveness, several 
attempts have been made to enhance the virulence of baculoviruses by 
introducing foreign genes that either severely impair or kill the targeted 
insect species (Table 16.7). One approach has been to use a gene that dis¬ 
rupts the normal life cycle of the insect when it is expressed within the host 
insect cells. 

During insect development, a reduction in the level of juvenile hor¬ 
mone in larvae initiates metamorphosis into pupae and leads to a cessation 
of larval feeding. The reduction in the juvenile hormone level is due to an 
increase in the amount of juvenile hormone esterase, an enzyme that con¬ 
verts the biologically active methyl ester form of juvenile hormone into an 
inactive acid form. Inhibition of juvenile hormone esterase activity leads to 
an in vivo accumulation of active juvenile hormone, so the larvae remain 
in the feeding stage longer, continue to grow, and eventually become giant 
larvae. Therefore, researchers reasoned that an experimentally induced 
increase in the supply of juvenile hormone esterase should lower the 
endogenous level of active juvenile hormone and cause a premature cessa¬ 
tion of feeding. Basically, their premise was that shortening the duration of 
larval feeding would curtail the extent of crop damage. 

To test this idea, the investigators first had to clone and express the 
gene for juvenile hormone esterase. This task was achieved by purifying 
the enzyme from the tobacco budworm (Heliothis virescens), determining its 
amino acid sequence, synthesizing a DNA oligomer that corresponds to a 
portion of the esterase amino acid sequence, and then using this oligomer 
as a hybridization probe. The coding sequence for juvenile hormone 
esterase was isolated from an H. virescens complementary DNA (cDNA) 
library and inserted into the genome of a baculovirus under the control of 
baculovirus transcription signals. When the cabbage looper (Trichoplusia ni) 
at the first larval instar stage was treated with this genetically modified 
baculovirus, the amount of juvenile hormone in the insect was reduced by 
the cloned juvenile hormone esterase, and larval feeding and growth were 
dramatically curtailed relative to feeding and growth by the control larvae 
that were treated with native baculovirus. 

TABLE 16.7 Some genes that have been introduced into the baculovirus genome to 
increase insecticidal activity 

Gene                         Effect on host insect of introduced gene 
Diuretic hormone             Reduced hemolymph volume 
Juvenile hormone esterase    Feeding cessation 
B. thuringiensis toxin       Feeding cessation 
Scorpion toxin               Paralysis 
Mite toxin                   Paralysis 
Wasp toxin                   Premature melanization, low weight gain 

FIGURE 16.17 Survival of T. ni larvae after treatment of cabbage leaves with either 
wild-type or recombinant baculovirus expressing a scorpion neurotoxin gene. The 
control cabbage plants were treated only with insect larvae. The lower the 
percentage of live larvae recovered, the greater the killing of the larvae and the more 
effective the treatment. Plants were treated with baculovirus only once at the start 
of the experiment. 

The usefulness of this approach for enhancing baculoviruses as general 
biocontrol agents has been questioned, because the reduction in larval 
feeding that is attributable to the effect of juvenile hormone esterase is con¬ 
fined to the first larval instar. Other stages of development are much less 
sensitive to this treatment. A baculovirus engineered to express juvenile 
hormone esterase would have to be applied when the majority of the target 
insect population was in its first larval instar stage, which, under natural 
conditions, is difficult to achieve. 

Another approach for enhancing the effectiveness of baculoviruses as 
a pesticide is to incorporate into the viral genome an insect-specific toxin 
gene that, when expressed during the viral infection cycle, will yield a 
potent insect toxin. The gene that encodes the insect-specific neurotoxin 
produced by the North African fat-tailed scorpion (Androctonus australis 
Hector) was cloned into a baculovirus strain, and the genetically 
engineered virus was tested as a biological insecticide. This neurotoxin, which 
does not have any effect on mice, disrupts the flow of sodium ions in the 
neurons of targeted insects and eventually leads to paralysis and death. 
Laboratory-raised insects that were infected with a baculovirus carrying 
the scorpion neurotoxin gene caused 50% less damage to the leaves of test 
plants than did insects that had been treated with wild-type baculovirus. 

When the cDNA for the toxin from the Israeli yellow scorpion (Leiurus 
quinquestriatus hebraeus) was cloned and introduced into the baculovirus 
Autographa californica multiple nuclear polyhedrosis virus, the time that it 
took to kill 50% of the insect larvae that were tested was reduced from 120 
to 78 h. Moreover, 120 h after infection, the insect larvae treated with 
recombinant virus gained only one-third as much weight as larvae treated 
with wild-type virus. Thus, this engineered baculovirus not only hastened 
the demise of the infected insect larvae, it also significantly decreased the 
ability of the insects to damage plants. Subsequent experiments have taken 
this system a step further. Researchers have studied the effects of different 
scorpion toxins, either separately or two at a time. They assessed whether 
either a combination of excitatory and depressant or alpha and depressant 
scorpion toxins would improve the efficacy of A. californica nuclear 
polyhedrosis virus, over a virus expressing only a single toxin, toward three 
different insect larvae. The best result was achieved by combined expression 
of the excitatory toxin and the depressant toxin. Under these conditions, 
the "effective time to paralysis" of H. virescens neonates was reduced to 
slightly less than 47 h. Additional improvements to this system have come 
from placing the scorpion toxin(s) under the transcriptional control of the 
p-PCm promoter, which contains the human cytomegalovirus minimal 
(CMVm) promoter ligated in cis with the polyhedrin upstream (pn) 
sequence. This results in a high level of expression of foreign genes at an 
early infection stage of the baculovirus. 

Recently, a genetically engineered A. californica nuclear polyhedrosis 
virus that expresses the insect-specific neurotoxin from Androctonus 
australis was tested under field conditions. Interestingly, the modified 
baculovirus was even more effective in the field than in the laboratory studies that 
demonstrated a 25 to 50% reduction in the time it took to kill the insect pest 
T. ni (Fig. 16.17). In the field, the genetically engineered baculovirus killed 
the insect pests faster, decreased the damage to cabbage plants, and 
reduced the secondary cycle of infection (infections caused by the next 
generation of the virus) compared to the wild-type virus. 

No matter how effective a particular genetically engineered baculo¬ 
virus may be in small-scale experiments, a major hurdle to more wide¬ 
spread use is the difficulty and cost of propagating such viruses. 
Baculoviruses are obligate parasites; therefore, they must be grown either 
in living whole organisms or in insect cell culture. In more developed coun¬ 
tries, the cost of baculovirus preparations, whether or not the virus has 
been genetically engineered, is not currently competitive with that of 
chemical insecticides. However, biological insecticides may become more 
appealing when the adverse environmental impact of chemical insecticides 
is factored into the cost-benefit analysis. 


SUMMARY 


Microbial insecticides are currently being developed as 
environmentally friendly biological substitutes for 
chemical pesticides. A number of subspecies of the bacterium 
B. thuringiensis produce a protoxin as part of a parasporal 
crystal that, after ingestion, kills specific insects. The transition 
from insecticidal protoxin to toxin occurs in the gut of the 
target insect and is mediated by the pH and digestive pro¬ 
teases in the gut. The death of the insect is the consequence of 
the formation of membrane channels in the gut cells, which 
allow ATP to escape and in turn lead to decreased cellular 
metabolism, cessation of feeding, dehydration, and eventually 
death. The B. thuringiensis toxins are highly specific for a lim¬ 
ited number of insect species, nontoxic to nontarget species, 
and biodegradable. Consequently, they are unlikely to cause 
significant biological selection for resistant forms under 
normal conditions. These attributes make these biological 
insecticides effective agents for controlling insect damage to 
certain crops and preventing the proliferation of insects that 
act as vectors of human diseases. 

The genes (cry) for various B. thuringiensis toxins have been 
cloned and characterized. By expressing a B. thuringiensis cry 
gene in a nonsporulating Bacillus strain, production of the 
insecticidal protein was achieved during vegetative growth, 
bypassing the need for parasporal crystal formation. 

To expand the specificity of a B. thuringiensis toxin to other 
pest insects, toxin genes from different subspecies were cloned 
into plasmids and introduced into another B. thuringiensis 
strain, either on a broad-host-range plasmid or by integration 
into the chromosomal DNA of the host cell. In addition to 
expressing the toxicity of the original strain, the bacteria with 
two different toxin genes sometimes showed an effect against 
a nontarget insect pest. In one study, it was found that modi¬ 
fication of domain II of the Cry protein is an effective means 
of increasing its toxicity to particular insects. Similarly, a 
fusion protein consisting of two toxin domains from different 
B. thuringiensis toxin genes was constructed by genetic manip¬ 
ulation, and the fusion protein retained both toxic activities. In 
another study, the receptor-binding domain of one insecticidal 
toxin was combined with the toxin domain of another. It is 
thought that insect resistance is less likely to develop when 
such hybrid toxins are used. In addition, the simultaneous 
application of Cry proteins and a peptide containing a portion 
of the host Cry receptor protein increases Cry protein insect 
toxicity, another strategy that can be used to overcome or sig¬ 
nificantly delay the development of insect resistance to Cry 
proteins. A further strategy that can both improve biocontrol 
activity and serve to limit the development of B. thuringiensis- 
resistant insects is the use of B. thuringiensis toxins together 
with other insecticidal proteins, such as chitinase or the B. 
thuringiensis subsp. israelensis Cyt1A protein. 

To ensure that B. thuringiensis spraying for the control of 
mosquitoes is effective, the B. thuringiensis toxin genes have 
been cloned into various microorganisms that live near the 
surfaces of ponds and are eaten by mosquito larvae. This 
strategy appears to be an effective means of delivering the B. 
thuringiensis toxin to the targeted insect. Also, rhizosphere 
bacteria that have been engineered with B. thuringiensis toxin 
genes lessen the damage caused by insects that attack the 
roots of plants. 

Baculoviruses are pathogenic to many different species of 
insects, but each strain of baculovirus is specific to a small 
number of insect species. Although baculoviruses kill their 
host organisms, the process is usually considered to be too 
slow to be effective for controlling insects that attack crop 
plants. However, when certain genes are cloned into different 
strains of baculovirus, the virus can act as a delivery system 
for a gene that produces an insecticidal protein during the 
viral life cycle. Several tests of this strategy have been suc¬ 
cessful in laboratory trials. In addition, when a gene for a 
neurotoxin that kills insects was cloned into a baculovirus, the 
construct was effective in field trials. 


REVIEW QUESTIONS 

1. What are the advantages of biological insecticides over 
chemical insecticides? 

2. Draw a simple phylogenetic tree that shows the relationship 
among Cry1Aa, Cry1Ab, Cry1Ba, Cry1Bb, Cry2Aa, and 
Cry2Ab. 

3. Why is the B. thuringiensis toxin not toxic to humans? 

4. Outline a strategy that you would use to isolate an insecti¬ 
cidal protoxin gene from B. thuringiensis subsp. israelensis. 
How would you use this gene in a practical way? 

5. How would you determine whether a particular insecti¬ 
cidal protoxin gene is present on a plasmid or part of the 
chromosome of a B. thuringiensis strain? 

6. How would you use genetic engineering to improve the 
usefulness of a particular B. thuringiensis protoxin? 

7. If a cry1C-cry1Ab fusion gene encoding an insecticidal 
protoxin consists of approximately 2,200 bp of DNA from the 
cry1C gene and 1,300 bp of DNA from the cry1Ab gene, what 
is the advantage of synthesizing this fusion protoxin 
compared with the Cry1C protoxin? 

8. How can insect gut enzymes be limited to processing the 
B. thuringiensis insecticidal protoxin to the active toxin 
without degrading the toxin? 

9. How would you engineer a Cry protein to lessen or avoid 
the development of insect resistance to this toxin? 

10. Why is the bacterium A. excentricus an attractive host 
organism for the expression of B. thuringiensis insecticidal 
toxin genes? 

11. How can the species range of an insecticidal B. thuringi¬ 
ensis strain be extended? 

12. What is a truncated B. thuringiensis insecticidal protoxin? 

13. Why is it unlikely that insects will ever develop resis¬ 
tance to B. thuringiensis subsp. israelensis strains? 

14. What are cadherins, and how can they be used to 
enhance the toxicity of a particular Cry protein? 

15. How would you improve the insecticidal properties of 
baculoviruses? 

16. How might it be possible to expand the range of insects 
that are infected by a particular baculovirus? 


Large-Scale Production of 
Proteins from Recombinant 
Microorganisms 


The production of commercial products that are synthesized by 
genetically engineered microorganisms requires the partnership of 
two kinds of experts. Molecular biologists are responsible for iso¬ 
lating, characterizing, modifying, and creating effectively expressed genes 
in microorganisms that can be used for industrial production, and bio¬ 
chemical engineers ensure that the genetically engineered form of a micro¬ 
organism can be grown in large quantities under conditions that give 
optimal product yields. In the early days of molecular biotechnology, biolo¬ 
gists naively thought that scale-up was simply a matter of multiplication; 
i.e., they believed that whatever conditions were found to be effective on a 
small scale would be equally effective on a large scale and that to achieve 
this it was merely necessary to use a larger reaction vessel with a corre¬ 
spondingly larger volume of medium. 

This simplistic view is far from reality. For example, good growth of 
aerobic microorganisms can usually be achieved in a standard 200-mL 
laboratory flask that is aerated with a mixer driven by a 300-watt motor. If 
the system were directly scaled up, a single 10,000-liter container would 
require a mixer with a 15-megawatt motor. Such a motor would be as large 
as a house, and the heat generated during stirring would boil the microor¬ 
ganisms. Although biochemical engineers may quibble about some aspects 
of this specific example, they all know that the industrial production of 
microorganisms is not merely a multiplication of bench scale conditions. 
For a start, increasing the size of the reaction vessel (bioreactor, or fer¬ 
menter) is required for the large-scale growth of microorganisms, because 
it would be impractical to set up 50,000 individual culture flasks, each con¬ 
taining 200 mL, to obtain 10,000 liters of cell suspension. 
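
The arithmetic behind this example can be checked with a short sketch. It assumes, as the naive view does, that mixer power scales linearly with culture volume; the function name `naive_scale_up` and its parameters are illustrative only and do not describe how real bioreactors are specified.

```python
def naive_scale_up(flask_volume_ml, flask_power_w, target_volume_l):
    """Scale mixer power linearly with volume (the naive assumption in the text)."""
    target_volume_ml = target_volume_l * 1000
    # Number of bench-scale flasks that equal the target volume.
    scale_factor = target_volume_ml / flask_volume_ml
    # Power needed if every unit of volume required the same stirring power.
    power_w = flask_power_w * scale_factor
    return scale_factor, power_w


factor, power = naive_scale_up(flask_volume_ml=200, flask_power_w=300,
                               target_volume_l=10_000)
print(f"equivalent number of 200-mL flasks: {factor:,.0f}")
print(f"naively scaled mixer power: {power / 1e6:.0f} MW")
```

With the figures quoted above, the sketch gives the 50,000 flasks and the 15-megawatt motor mentioned in the text, which is exactly why direct multiplication is not a workable scale-up strategy.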

A number of parameters must be precisely regulated to obtain maximum 
yields from either small (1- to 10-liter) or large (>1,000-liter) 
bioreactors. These parameters include the temperature, pH, rate and nature of 
mixing of the growing cells, and, with aerobic organisms, oxygen demand. 
Moreover, the optimal conditions generally change with each 10-fold 
increase in the volume of a bioreactor. 

There are also other technical considerations. The design of the biore¬ 
actor is important. It should ensure adequate sterility and provide appro¬ 
priate levels of containment of genetically engineered microorganisms. The 
reactor should also include probes that permit the accurate and continuous 
on-line monitoring of as many critical reaction parameters as possible so 
that adjustments can be made rapidly and easily throughout the course of 
the fermentation reaction (i.e., the growth of the microorganism). In addi¬ 
tion, because sterilization may alter the composition of the medium (e.g., 
by destroying vitamins), it is important to ascertain that the medium com¬ 
position is still optimal for maximal microbial growth following steriliza¬ 
tion. 

FIGURE 17.1 Generalized scheme for a large-scale fermentation process. The 
commercial product is usually in either the cell or cell-free fraction, but not in 
both; consequently, one or the other of these fractions will be processed 
further (+) or discarded (-). 

Generally, large-scale fermentation and product purification are stepwise 
processes (Fig. 17.1). A typical procedure begins with formulation and 
sterilization of the growth medium and sterilization of the fermentation 
equipment. The cells are grown first as a stock culture (5 to 10 mL), then in 
a shake flask (200 to 1,000 mL), and then in a seed fermenter (10 to 100 
liters). Finally, the production fermenter (1,000 to 100,000 liters) is 
inoculated. After the fermentation step is completed, the cells are separated from 
the culture fluid by either centrifugation or filtration. If the product is 
intracellular, the cells are disrupted, the cell debris is removed, and the product 
is recovered from the debris-free fluid. If the product is extracellular, it is 
purified from the cell-free culture medium. 
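
To make the size of each step in this inoculum train concrete, the following sketch computes the approximate fold increase in culture volume from one stage to the next. The volume ranges come from the paragraph above; pairing the low end of one stage with the low end of the next (and high with high) is an assumption made purely for illustration, as are the names `STAGES` and `fold_increases`.

```python
# Working-volume ranges in mL for each stage, as quoted in the text.
STAGES = [
    ("stock culture", 5, 10),
    ("shake flask", 200, 1_000),
    ("seed fermenter", 10_000, 100_000),               # 10 to 100 liters
    ("production fermenter", 1_000_000, 100_000_000),  # 1,000 to 100,000 liters
]


def fold_increases(stages):
    """Return (from_stage, to_stage, low_fold, high_fold) for each transfer."""
    steps = []
    for (name_a, low_a, high_a), (name_b, low_b, high_b) in zip(stages, stages[1:]):
        # Assumption: compare low end with low end and high end with high end.
        steps.append((name_a, name_b, low_b / low_a, high_b / high_a))
    return steps


for src, dst, low, high in fold_increases(STAGES):
    print(f"{src} -> {dst}: about {low:,.0f}- to {high:,.0f}-fold")
```

Under that pairing, each transfer enlarges the culture by roughly one to three orders of magnitude, which is consistent with growing the cells in stages rather than inoculating the production vessel directly from a stock culture.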


Principles of Microbial Growth 

Microorganisms can be grown in batch, fed-batch, or continuous culture 
(Fig. 17.2). In batch fermentation, the sterile growth medium is inoculated 
with the appropriate microorganisms, and the fermentation proceeds 
without the addition of fresh growth medium. In fed-batch fermentation, 
nutrients are added incrementally at various times during the fermentation 
reaction; no growth medium is removed until the end of the process. In the 
continuous fermentation process, fresh growth medium is added continu¬ 
ously during fermentation, but there is also concomitant removal of an 
equal volume of spent medium containing suspended microorganisms. For 
each type of fermentation, oxygen (which is usually provided in the form 
of sterile air), an antifoaming agent, and, if required, acid or base are 
injected into the bioreactor as needed. 

Batch Fermentation 

During a batch fermentation, the composition of the culture medium, the 
concentration of microorganisms (biomass concentration), the internal 
chemical composition of the microorganisms, and the amount of either 
target protein or metabolite all change as a consequence of the state of cell 
growth, cellular metabolism, and availability of nutrients. Under these 
conditions, six typical phases of growth are usually observed: lag phase, 
acceleration phase, logarithmic (log) or exponential phase, deceleration 
phase, stationary phase, and death phase (Fig. 17.3). 

FIGURE 17.2 Schematic representation of the time course (progress curves) of cell 
concentration (mass) and substrate concentration in batch (A), fed-batch (B), and 
continuous (C) fermentations. 

FIGURE 17.3 Pattern of microbial cell growth in a batch fermenter. The six phases of 
the growth cycle are (1) lag, (2) acceleration, (3) log (exponential), (4) deceleration, 
(5) stationary, and (6) death. 

Typically, there is no immediate increase in the numbers of cells after 
the inoculation into sterilized growth medium. This initial period is called 
the lag phase. During the lag period, the microbial cells adapt to the new 
environmental conditions. The cells may have to adjust to a different pH or 
to a new level of available nutrients. As part of an adaptive response, previ¬ 
ously unexpressed metabolic pathways may be induced. A lag phase gen¬ 
erally occurs whenever the cells of the inoculum are derived from a culture 
that has stopped growing (i.e., has entered stationary phase) because of 
substrate limitation or product inhibition. These cells need time to reset 
their metabolic systems to adjust to the new medium. The length of the lag 
phase corresponds to how long the inoculated cells were in stationary 
phase and the extent to which the previous growth medium of the starting 
cells differed from the new, fresh culture medium. Conversely, when the 
inoculum is a cell culture from a growing cell population in log phase, a 
discernible lag phase may not occur and growth may begin immediately. 
Following the lag phase, the brief period when the rate of cell growth 
increases until log-phase growth is attained is called the acceleration 
phase. 

During the log phase of growth, the cell mass undergoes several cell 
doublings and the specific growth rate of the culture remains constant. 
With excess substrate (nutrient supply) and no inhibition of growth by a 
compound that is present in the growth medium, the specific growth rate 
is independent of the substrate concentration. These changes and other 
related steps can be represented in mathematical form, making it possible 
for biochemical engineers to precisely model and then more easily scale up 
microbial cell growth. In this case, the rate of increase of the cell biomass 
with time, $dX/dt$, is the product of the specific growth rate, $\mu$, and the 
biomass concentration, $X$: 

$$\frac{dX}{dt} = \mu X$$ 

Similarly, the rate of increase of the cell number, $dN/dt$, is the product of the 
specific growth rate, $\mu$, and the cell number, $N$: 

$$\frac{dN}{dt} = \mu N$$ 

The specific growth rate, $\mu$, is a function of the concentration of the limiting 
substrate (i.e., the carbon or nitrogen source), $S$; the maximum specific 
growth rate, $\mu_{\max}$; and a substrate-specific constant, $K_s$. Both $S$ and $K_s$ are 
expressed in terms of concentration, e.g., in either grams or moles per 
liter: 

$$\mu = \frac{\mu_{\max} S}{K_s + S}$$ 

Sometimes, scientists refer to the doubling or generation time, $t$, of a culture 
rather than to its specific growth rate, $\mu$, where $t = \ln 2/\mu$. The generation 
time of a culture is the length of time that it takes, under defined 
conditions, for the number of cells or the cell biomass to double. For the 
microorganisms that are commonly grown in culture, the value of $\mu_{\max}$ varies 
from about 2.1 to 0.086 h$^{-1}$ (reciprocal hours), which corresponds to 
doubling times of approximately 20 minutes to 8 hours. 
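
The relationship between the specific growth rate and the doubling time can be checked numerically. The sketch below assumes pure exponential growth at a constant specific growth rate (written `mu_per_h` in the code); the function names are illustrative and not taken from the text.

```python
import math


def doubling_time_h(mu_per_h):
    """Generation time in hours for a constant specific growth rate (h^-1)."""
    return math.log(2) / mu_per_h


def biomass(x0, mu_per_h, t_h):
    """Biomass after t_h hours of exponential growth: X(t) = X0 * exp(mu * t)."""
    return x0 * math.exp(mu_per_h * t_h)


# The two extremes of the maximum specific growth rate quoted in the text.
for mu in (2.1, 0.086):
    t = doubling_time_h(mu)
    print(f"mu = {mu} h^-1 -> doubling time of about {t:.2f} h ({t * 60:.0f} min)")

# After exactly one doubling time, the biomass has doubled.
print(biomass(1.0, 2.1, doubling_time_h(2.1)))
```

The two rates give doubling times close to 20 minutes and 8 hours, respectively, in agreement with the range stated above.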

When there is an excess of substrate (i.e., when $S \gg K_s$), then $\mu \approx \mu_{\max}$ 
and the maximal rate of log-phase growth of the culture occurs. In practice, 
the value of $K_s$ is often so low that substrate levels equivalent to $K_s$ are 
rarely encountered during log-phase growth. For example, for Escherichia 
coli, while the $K_s$ for glucose is approximately 1 mg/liter, the initial glucose 
level is usually around 10,000 mg/liter. However, as the culture nears the 
end of the log phase, the concentration of the remaining substrate, $S$, is 
depleted and may even fall below the value of $K_s$. Under conditions where 
$S < K_s$, the microorganisms rapidly enter the deceleration phase. However, 
because of the large cell population at the end of the log phase, the 
substrate may be so rapidly assimilated that the deceleration phase is 
short-lived and not observable. 
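
A small numerical sketch shows why the specific growth rate stays near its maximum for most of the log phase. It uses the saturation relationship given earlier, in which the specific growth rate equals the maximum rate multiplied by S/(Ks + S), together with the E. coli glucose figures quoted above. The maximum rate is set to 1 so that the result reads as a fraction of the maximum; the function name `specific_growth_rate` and the intermediate substrate levels are assumptions chosen for illustration.

```python
def specific_growth_rate(s, mu_max, k_s):
    """Specific growth rate for limiting-substrate concentration s."""
    return mu_max * s / (k_s + s)


K_S = 1.0     # mg/liter, approximate value for glucose with E. coli (from the text)
MU_MAX = 1.0  # normalized, so the output is the fraction of the maximum rate

# 10,000 mg/liter is the typical starting glucose level quoted in the text;
# the lower values are illustrative points as the substrate is depleted.
for s in (10_000, 100, 1.0, 0.1):
    fraction = specific_growth_rate(s, MU_MAX, K_S)
    print(f"S = {s:>8} mg/liter -> {fraction:.4f} of the maximum rate")
```

Growth is essentially at the maximum rate until the substrate concentration falls to within a couple of orders of magnitude of Ks; at S equal to Ks the rate is exactly half of the maximum, and below that it drops off quickly.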

After either the depletion of a critical growth substance, such as the 
carbon source, from the medium or the accumulation of metabolic end 
products that inhibit growth, the increase in cell mass eventually ceases 
and the cells enter the stationary phase. During this phase, although the 
amount of biomass remains constant, cellular metabolism often changes 
dramatically; in some instances, compounds (secondary metabolites) that 
are of considerable commercial interest are synthesized. For example, anti¬ 
biotics are usually produced during the stationary phase of the microbial 
growth cycle. The duration of the stationary phase depends on the partic¬ 
ular organism and the conditions of growth. 

In the death phase, the energy reserves of the cell are virtually 
exhausted, and metabolic activity ceases. For most commercial processes, 
the fermentation reaction is halted and the cells are harvested before the 
death phase begins. 

Fed-Batch Fermentation 

In fed-batch fermentations, substrate is added in increments at various 
times throughout the course of the reaction. These additions prolong both 
the log and stationary phases, thereby increasing the biomass and the 
amount of synthesis of stationary-phase metabolites, such as antibiotics. 
However, microorganisms in stationary phase often produce proteolytic 
enzymes (proteases), and these enzymes can degrade proteins synthesized 
by a genetically engineered microorganism. Therefore, when proteins are 
produced from a recombinant microorganism, it is important that the fer¬ 
mentation reaction not be allowed to reach this part of the growth cycle. 
Because it is often difficult to measure the substrate concentration directly 
during the fermentation reaction, other indicators that are correlated with 
the consumption of substrate, such as the production of organic acids, 

Expression of Intracellular Hemoglobin Improves 
Protein Synthesis in Oxygen-Limited Escherichia coli 

C. Khosla, J. E. Curtis, J. DeModena, U. Rinas, and J. E. Bailey 
Bio/Technology 8:849-853,1990 


Because of the low solubility of 
oxygen in water, the growth of 
aerobic bacteria often becomes 
limited by the amount of dissolved 
oxygen in the fermentation broth. This 
problem is especially acute at high cell 
densities or when fermentations are 
performed on a large scale. To address 
this problem, chemical engineers have 
attempted to increase the rate of 
transfer of introduced oxygen to the 
liquid of the growth medium. Their 
approaches have included (1) sparging 
the growth medium with pure oxygen 
rather than with air; (2) introducing 
the air (or oxygen) under pressure; (3) 


adding chemicals, such as perfluoro- 
carbons, to the fermentation broth to 
increase the solubility of oxygen; and 
(4) modifying the configuration of the 
fermentation vessel to optimize the 
aeration or agitation of the fermenta¬ 
tion broth. While all of these solutions 
to the "oxygen problem" are some¬ 
what effective, they are subject to a 
threshold effect beyond which it is 
impossible to introduce a sufficient 
amount of oxygen to improve the final 
yield of the fermentation. 

As an alternative to these "hard¬ 
ware" solutions, Bailey and coworkers 
designed a biological system in which 


the host organism was modified so 
that it would be more efficient at 
using the low levels of oxygen that 
are normally present in the growth 
medium. They cloned a gene 
encoding a hemoglobin-like molecule 
from the gram-negative bacterium 
Vitreoscilla and introduced it into sev¬ 
eral different recombinant bacteria. 
The introduced bacterial hemoglobin 
bound oxygen from the environment 
and created a higher level of available 
oxygen within the cells, which 
resulted in an increase in growth and 
foreign-gene expression. This 
approach provided a clever biological 
solution to what at first glance 
seemed to be an almost insurmount¬ 
able engineering problem. 



changes in the pH, or the production of CO2, can be used to estimate when 
additional substrate is needed. Generally, fed-batch fermentations require 
more monitoring and greater control than batch fermentations and are 
therefore used to a lesser extent. However, since they may be advantageous 
in the development of systems for the production of proteins from recom¬ 
binant microorganisms, they are becoming increasingly popular. 

The periodic addition of substrate to a growing microbial culture pro¬ 
longs the log phase of growth and delays the onset of the stationary phase, 
which initiates cellular stress responses, the production of proteases, and 
other metabolic changes that affect the yield of a recombinant protein. 
Nevertheless, with continued cell growth, an increasing amount of the 
incoming substrate is needed for maintenance of the host cell metabolism. 
This means that fewer cellular resources are used for the synthesis of the 
recombinant protein or commercial metabolite(s). To ensure that the syn¬ 
thesis and stability of a recombinant protein are not impaired, increasing 
amounts of nutrients must be added to the growing culture. This may be 
done by carefully monitoring the fermentation reaction and adding sub¬ 
strates (carbon and nitrogen sources, together with trace elements) in 
increasing amounts as they are needed. Depending upon the particular 
microorganism, its genetic background, and the nature of the recombinant 
protein, a fed-batch fermentation strategy can increase the yield by 25% 
to more than 1,000% compared with batch fermentation. 

Fed-batch processes are not limited to microbial cells but are also used 
with mammalian and insect cells in culture. This is important because (1) 
these cell culture systems are increasingly being used for the production of 
human therapeutic proteins, and (2) in the absence of fed-batch strategies, 
animal cells in batch culture are not very efficient in producing foreign 
proteins. 

Continuous Fermentation 

In a continuous fermentation, a steady-state condition, where dX/dt = 0, is 
attained when the total number of cells and the total volume in the bio¬ 
reactor remain constant. In other words, under these conditions, the loss of 
cells due to outflow (product removal) is exactly balanced by the gain in 
new cells by growth (division). In more formal terms, for a continuous 
steady-state fermentation process, the dilution rate, D, is defined as the 
volumetric flow rate, F, divided by the constant liquid volume, V, in the 
bioreactor: 


D = F/V 

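where, at steady state, D is equal to the specific growth rate, μ (here 
dX/dt refers to the increase in cell mass due to growth alone, which is 
exactly offset by the loss of cells in the outflow): 

D = (dX/dt)(1/X) = μ 

To obtain hydrodynamically stable continuous cultures, the specific 
growth rate, μ, of the culture must be lower than the maximum attainable 
specific growth rate, μ_max. In practice, this condition is achieved by adjusting 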
a pump that controls the volumetric flow rate, F, while keeping the volume 
of the culture within the bioreactor, V, constant. 
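
The relationship between flow rate, volume, and growth rate described above can be illustrated with a short calculation. The following is a minimal sketch only; the function names and the numerical values (a 10-liter culture, a 2-liter/hour pump setting, and a μ_max of 0.5 per hour) are hypothetical and are not taken from the text.

```python
def dilution_rate(flow_rate_l_per_h: float, volume_l: float) -> float:
    """Return the dilution rate D = F/V (per hour)."""
    if volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return flow_rate_l_per_h / volume_l


def is_stable(dilution: float, mu_max: float) -> bool:
    """At steady state mu equals D, so D must stay below mu_max.

    If D reaches or exceeds mu_max, cells are removed faster than
    they can divide and the culture washes out.
    """
    return 0 < dilution < mu_max


# Hypothetical operating point: F = 2 liters/hour, V = 10 liters.
D = dilution_rate(flow_rate_l_per_h=2.0, volume_l=10.0)
print(f"D = {D:.2f} per hour, stable: {is_stable(D, mu_max=0.5)}")
```

In this sketch, only the pump setting F is varied while V is held constant, which mirrors the way the text describes the condition being achieved in practice.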

The fundamental objective of industrial fermentations is to minimize 
costs and maximize yields. This goal can be achieved by developing the 
most efficient mode of fermentation for each particular process. Although 
the commercial use of continuous fermentation processes is typically lim¬ 
ited to production of single-cell protein, antibiotics, and organic solvents, 
primarily because of the greater experience that scientists have with 
growing cells in batch mode, the cost of producing a specific amount of cell 
biomass by continuous culture is potentially much lower than producing 
the same amount by batch fermentation. The following factors account for 
the savings. 

• Continuous fermentations use smaller bioreactors than batch fer¬ 
mentations to produce the same amount of product. 

• After a large-scale batch fermentation is completed, large-scale 
equipment is needed for cell harvesting, cell breakage, and subse¬ 
quent downstream processing (purification) of the protein or metab¬ 
olite product that is produced by the microorganism. Continuously 
grown cells, however, are produced "a little bit at a time," so that the 
equipment required for cell harvesting, cell breakage, and down¬ 
stream processing can be much smaller. 

• Continuous fermentation, by definition, avoids the "down time" 
between batch runs, during which the bioreactor is prepared for 
reuse. A common hindrance to efficient industrial fermentation is 
the loss of productivity due to the down time of the bioreactor for 
repairs, cleaning, or sterilization. Continuous fermentations have 
less down time, because a single reaction can be maintained for a 
much longer period. 

• The physiological state of the cells during continuous fermentation 
is more uniform, so that yields of product are more consistent. In 
batch fermentations, small differences in the timing of cell harvest, 
which coincides with the mid- to late log phase of growth, can lead 
to significant physiological differences. 

Despite its merits, continuous fermentation has potential drawbacks 
that must be addressed before its use becomes more widespread. 

• The duration of a continuous fermentation can be 500 to 1,000 hours, 
and therefore, some cells might lose recombinant plasmid con¬ 
structs. Cells that lack plasmids usually have a smaller energy 
burden and divide faster than do plasmid-containing microorgan¬ 
isms, so that yields may decline with time because fewer cells are 
synthesizing the product protein. Integration of the cloned gene into 
the genome of the host organism avoids this problem. 

• Maintenance of sterile conditions on an industrial scale for long 
periods is difficult. Furthermore, continuous processes need sterile 
backup equipment, a requirement that can greatly increase capital 
costs. 

• The composition of culture medium that is used for industrial fer¬ 
mentations is not subject to the same level of quality assurance as 
that accorded laboratory medium components and therefore may 
vary from batch to batch. This variation can alter the physiology of 
the cells and decrease productivity. 

Because batch fermentation has a proven history of reliability, there is 
reluctance to switch to another type of fermentation system, even though a 
continuous mode of operation is generally regarded as the most efficient 
fermentation strategy. Nevertheless, a number of researchers have recently 
developed, on the scale of a laboratory (up to 10 liters) or pilot plant (up to 
1,000 liters), both continuous and fed-batch processes for the production of 
proteins from recombinant microorganisms. Therefore, it is probably only 
a matter of time before the use of continuous and fed-batch fermentations 
becomes more widespread in industry. 


Maximizing the Efficiency of the Fermentation Process 

Regardless of the type of fermentation process that is used to grow cells, it 
is necessary to monitor and control culture parameters, such as the dis¬ 
solved oxygen concentration, pH, temperature, and degree of mixing. 
Changes in any one of these parameters can have a dramatic effect on the 
yield of cells and the stability of the protein product. 

Optimal growth of E. coli cells and many other microorganisms that are 
used as hosts for cloned genes usually requires large amounts of dissolved 
oxygen. The maximal oxygen demand in a fermentation, Q_max, is dependent 
on the cell mass, X; the maximal specific growth rate, μ_max; and the 
growth yield based on oxygen consumed, Y_O2, where 

Q_max = X μ_max / Y_O2 

Because oxygen is only sparingly soluble in water (0.0084 gram/liter at 
25°C), it must be supplied continuously—generally in the form of sterilized 
air—to a growing bacterial culture. However, the introduction of air into a 
bioreactor produces bubbles, and if the bubbles are too large, the rate of 
transfer of oxygen to the cells is insufficient to support optimal growth. 
Thus, fermenter design should include provision for monitoring the 
dissolved-oxygen level of the culture, providing oxygen to the culture, and 
adequately mixing the culture to efficiently disperse the bubbles. 
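
To show why oxygen must be supplied continuously, the oxygen demand can be compared with the amount of oxygen that water holds at saturation. The sketch below is illustrative only: the solubility (0.0084 gram/liter at 25°C) comes from the text, but the cell mass, growth rate, and yield values, and the assumed units of Y_O2 (grams of cells per gram of oxygen), are hypothetical.

```python
OXYGEN_SOLUBILITY_G_PER_L = 0.0084  # at 25 degrees C, value given in the text


def max_oxygen_demand(cell_mass_g_per_l: float,
                      mu_max_per_h: float,
                      yield_g_cells_per_g_o2: float) -> float:
    """Q_max = X * mu_max / Y_O2, in grams of oxygen per liter per hour."""
    return cell_mass_g_per_l * mu_max_per_h / yield_g_cells_per_g_o2


def seconds_until_depleted(q_max_g_per_l_h: float) -> float:
    """Time for a saturated, unaerated broth to run out of dissolved oxygen."""
    return OXYGEN_SOLUBILITY_G_PER_L / q_max_g_per_l_h * 3600.0


# Hypothetical culture: X = 10 g/liter, mu_max = 0.5 per hour, Y_O2 = 1.0 g/g.
q_max = max_oxygen_demand(10.0, 0.5, 1.0)
print(f"Q_max = {q_max:.1f} g O2 per liter per hour")
print(f"Saturated broth would last about {seconds_until_depleted(q_max):.1f} seconds")
```

Under these assumed values the dissolved oxygen would be consumed within seconds, which is consistent with the need for continuous aeration and efficient bubble dispersion.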

Most microorganisms grow optimally between pH 5.5 and 8.5. 
However, during growth in a bioreactor, cellular metabolites are released 
into the growth medium, a process that can change the pH of the medium. 
Therefore, the pH of the medium must be monitored and either acid or 
base must be added as needed to maintain a constant pH. Of course, the 
added acid or base must be well mixed into the fermentation broth so that 
the pH of the growth medium is the same throughout the entire reaction 
vessel. 

Maintenance of the correct temperature is essential for the success of a 
fermentation reaction. Microorganisms grown at a temperature below the 
optimum grow slowly and have a reduced rate of cellular production (pro¬ 
ductivity). On the other hand, if the growth temperature is too high—but 
not lethal—there may be premature induction of the expression of the 
target protein, if it is under the control of a temperature-sensitive repressor, 
or induction of a heat shock (stress) response, which will produce cellular 
proteases that lower the yield of the protein product. 

Adequate mixing of a microbial culture is essential for many aspects of 
a fermentation, including assurance of an adequate supply of nutrients to 
the cells and prevention of the accumulation of any toxic metabolic by¬ 
products in local, poorly mixed regions of the bioreactor. Effective mixing 
is relatively easily attained with small-scale cultures, but it is one of the 
major problems when the scale of fermentation is increased. 

Agitation of the fermentation broth also affects other factors, such as 
the rate of transfer of oxygen from the gas bubbles to the liquid medium 
and then from the medium to the cells, efficient heat transfer, accurate mea¬ 
surement of specific metabolites in the culture fluid, and efficient disper¬ 
sion of added solutions, such as acids, bases, nutrients, or antifoaming 
agents. On these grounds, it might be concluded that the more mixing there 
is, the better the growth. However, excessive agitation of a fermentation 
broth can cause hydromechanical stress (shear), which damages larger 
microbial or mammalian cells, and a temperature increase, which may also 
decrease cell viability. Thus, a balance must be struck between the need to 
provide thorough mixing and the need to avoid damage to the cells. 

There is an additional consideration for scaled-up fermentations that 
has nothing to do with the technical aspects of the process but depends 
instead on whether a genetically engineered microorganism is being used. 
In most countries, specific rules and regulations must be followed when 
genetically engineered microorganisms are grown on a large scale. 
Although most recombinant microorganisms are not hazardous, it is nev¬ 
ertheless important to ensure that they are not inadvertently released into 
the environment. Therefore, fail-safe systems are used to prevent accidental 
spills of live recombinant organisms and to contain them if they occur. 
Furthermore, all recombinant microorganisms must be treated by an 
approved procedure to render them nonviable before they are discharged 
from the production facility. The spent culture medium must also be treated 
to ensure that it does not contain viable organisms and that its disposal 
does not create an environmental problem. 

High-Density Cell Cultures 

A major objective of fermentation is to maximize the volumetric 
productivity, i.e., to obtain the largest amount of product in a given volume 
in as short a time as possible. High cell densities are absolutely necessary 
for high productivity. Generally, when foreign proteins are produced by 
recombinant E. coli, the greater the final cell density, the greater the amount of 
product that is formed. In practice, cell concentrations of more than 50 (and 
in a few cases more than 150) grams (dry weight) of cells per liter of culture 
have been obtained with fed-batch cultures of recombinant E. coli. The dry 
weight of E. coli cells is approximately 20 to 25% of the wet weight. 

One way to achieve a high density of E. coli cells is to optimize the 
growth medium. Some nutrients, including carbon and nitrogen sources, 
can inhibit cell growth if they are present at too high a concentration. 
Glucose is inhibitory above 50 grams per liter, ammonia is inhibitory above 
3 grams per liter, iron is inhibitory above 1.15 grams per liter, magnesium 
is inhibitory above 8.7 grams per liter, phosphorus is inhibitory above 10 
grams per liter, and zinc is inhibitory above 0.038 gram per liter. Therefore, 
merely increasing the amount of nutrients in the growth medium in batch 
culture does not necessarily yield a high cell density. In addition, since the 
nutrients in complex media, such as peptone or yeast extract, can vary from 
one batch of medium to another, fermentations that use complex media are 
not always reproducible. 
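
The inhibitory thresholds listed above lend themselves to a simple check of a proposed medium. The sketch below is one possible way to do this; the threshold values are those given in the text, while the function name and the example medium are hypothetical.

```python
# Inhibitory concentrations (grams per liter) as listed in the text.
INHIBITORY_G_PER_L = {
    "glucose": 50.0,
    "ammonia": 3.0,
    "iron": 1.15,
    "magnesium": 8.7,
    "phosphorus": 10.0,
    "zinc": 0.038,
}


def inhibitory_components(medium_g_per_l: dict) -> list:
    """Return the names of nutrients whose concentration exceeds its threshold."""
    return sorted(
        name
        for name, conc in medium_g_per_l.items()
        if name in INHIBITORY_G_PER_L and conc > INHIBITORY_G_PER_L[name]
    )


# Hypothetical batch medium that simply "adds more" of some nutrients.
proposed = {"glucose": 60.0, "ammonia": 2.0, "zinc": 0.05}
print(inhibitory_components(proposed))
```

A check of this kind makes the point of the paragraph concrete: raising nutrient concentrations in a batch medium can push individual components past the level at which they inhibit growth.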

Acetate, which can be inhibitory to cell growth, is produced by E. coli 
both when the cells are grown under oxygen-limiting conditions and in the 
presence of excess glucose. The acetate problem can be minimized by using 
glycerol instead of glucose as a carbon source, lowering the culture 
temperature, or using an E. coli strain that has been genetically engineered to 
shunt acetate into less toxic compounds (see below). 

Oxygen may also become limited in high-density cell cultures. To over¬ 
come this problem, the rate of introduction of air (sparging), the agitation 
rate, or both can be increased. Also, pure oxygen rather than air, which is 
only approximately 20% oxygen, can be introduced into growing cell cul¬ 
tures. Cells can also be grown under pressure to increase the solubility of 
oxygen, which increases the rate of transfer of oxygen to the cells in the 
aqueous growth medium. Alternatively, expression of the 
Vitreoscilla hemoglobin gene in a number of different host organisms has been 
shown to significantly increase the uptake of oxygen by growing cells and 
to thereby increase the amount of product formed (see chapter 6). For 
example, Vitreoscilla hemoglobin can enhance growth and heterologous 
protein production in E. coli, improve enzyme production in Bacillus 
subtilis, increase erythromycin production by Saccharopolyspora erythraea, 
improve the rate of degradation of benzoic acid by Xanthomonas maltophilia, 
and enhance the production of cephalosporin C by Acremonium 
chrysogenum. 

High-density cell cultures are most readily attained in fed-batch cul¬ 
tures. The addition of nutrients following the depletion of some of the 
original nutrients may be constant, stepwise, or exponential. With 
constant-rate feeding, nutrients are added at the same rate throughout the 
fermentation. However, under these conditions, the specific growth rate continually 
declines. With stepwise feeding, increasing amounts of nutrients are added 
at higher cell concentrations. In this case, the specific growth rate decline is 
largely compensated for. With exponential feeding, nutrients are added at 
an exponential rate, with the result that a constant specific growth rate can 
be achieved. It is possible to automate the fed-batch addition of nutrients 
based on measuring the concentration of a growth-limiting substrate, such 
as glucose, in the culture medium during the fermentation process. 
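
The exponential feeding strategy described above can be sketched numerically. The feed equation used here is a commonly used simplification and is an assumption, not something given in the text: it supposes that all fed substrate is converted to biomass with a fixed yield and it ignores the maintenance demand of the cells. All parameter names and values are hypothetical.

```python
import math


def exponential_feed_rate(t_h: float,
                          mu_set_per_h: float,
                          x0_g_per_l: float,
                          v0_l: float,
                          yield_g_per_g: float,
                          feed_conc_g_per_l: float) -> float:
    """Feed rate (liters/hour) at time t that holds the specific growth rate at mu_set.

    F(t) = mu_set * X0 * V0 / (Y * S_feed) * exp(mu_set * t)
    Assumes a constant biomass yield Y and no maintenance requirement.
    """
    initial_rate = mu_set_per_h * x0_g_per_l * v0_l / (yield_g_per_g * feed_conc_g_per_l)
    return initial_rate * math.exp(mu_set_per_h * t_h)


# Hypothetical run: 10 liters at 5 g cells/liter, 500 g/liter glucose feed,
# yield of 0.5 g cells per g glucose, target growth rate of 0.2 per hour.
for hour in (0, 2, 4, 6):
    rate = exponential_feed_rate(hour, 0.2, 5.0, 10.0, 0.5, 500.0)
    print(f"t = {hour} h: feed {rate:.3f} liters/hour")
```

Because the feed rate grows at the same exponential rate as the biomass, the amount of substrate supplied per cell stays constant, which is why this mode can hold the specific growth rate steady where constant-rate feeding cannot.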

Increasing Plasmid Stability 

The loss of plasmids during the large-scale growth of recombinant E. coli 
cells is a major industrial problem. Plasmid loss often limits the yield of 
plasmid-encoded recombinant proteins, especially when cells are grown 
in continuous culture. Plasmid instability in bacterial cultures is typically 
a consequence of the unequal distribution of plasmids to daughter cells 
during growth and cell division. Generally, once cells have lost a plasmid, 
they grow faster, with the result that cells lacking plasmids eventually 
dominate the culture. One approach to avoid this problem is to include an 
antibiotic resistance gene on the plasmid being used and then add that 
antibiotic to the culture medium. In addition to the obvious economic cost 
of the antibiotic, especially when dealing with large-scale cultures, dis¬ 
posal of spent growth medium is a potential environmental hazard in that 
both the antibiotic and antibiotic resistance genes may be released into the 
environment. One way to get around this problem is to delete an essential 
gene from the chromosomal DNA of the host bacterium and at the same 
time place this gene on the plasmid that is being stabilized. As a result, 
only plasmid-carrying cells can grow, making the bacterial strain totally 
dependent upon maintenance of the plasmid. In one example, the essential 
gene that was used encoded translation initiation factor 1 (Fig. 17.4). 
The target gene was placed under the transcriptional control of the strong 
and IPTG (isopropyl-β-D-thiogalactopyranoside)-inducible p_trc promoter. 


FIGURE 17.4 Schematic representation of a bacterial cell in which the essential 
protein synthesis initiation factor 1 gene (infA) was deleted from the chromosome and 
included on a plasmid. The target gene is under the transcriptional control of the 
strong trc promoter (p_trc) that is controlled by the lac repressor encoded by lacIq, 
which overproduces the protein. 

With this system, selection that utilizes antibiotics is no longer necessary, 
thereby decreasing both the cost and the environmental risk associated 
with large-scale fermentation. 
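
The way in which plasmid-free cells come to dominate a culture can be illustrated with a toy model. The sketch below is a deliberately simple illustration and not a model taken from the text: it assumes that each division of a plasmid-bearing cell yields one plasmid-free daughter with a fixed probability and that plasmid-free cells divide slightly faster. The function name and parameter values are hypothetical.

```python
def plasmid_fraction(generations: int,
                     loss_prob: float,
                     growth_advantage: float,
                     start_fraction: float = 1.0) -> list:
    """Track the fraction of plasmid-bearing cells over successive generations.

    loss_prob: chance that a division of a plasmid-bearing cell produces
        one plasmid-free daughter (unequal plasmid distribution).
    growth_advantage: fractional increase in the division rate of
        plasmid-free cells relative to plasmid-bearing cells.
    """
    plus, minus = start_fraction, 1.0 - start_fraction
    history = [plus]
    for _ in range(generations):
        new_plus = plus * (2.0 - loss_prob)
        new_minus = minus * 2.0 ** (1.0 + growth_advantage) + plus * loss_prob
        total = new_plus + new_minus
        # Renormalize so the two values remain fractions of the population.
        plus, minus = new_plus / total, new_minus / total
        history.append(plus)
    return history


# Hypothetical values: 1% loss per division, 10% faster growth without plasmid.
history = plasmid_fraction(generations=100, loss_prob=0.01, growth_advantage=0.1)
for gen in (0, 25, 50, 100):
    print(f"generation {gen}: {history[gen]:.3f} of cells carry the plasmid")
```

In this toy model, setting loss_prob to zero keeps the plasmid-bearing fraction at 1.0, which corresponds to the situation created by the essential-gene strategy, where cells without the plasmid cannot grow.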

Quiescent E. coli Cells 

While it is possible to achieve high levels of foreign-gene expression in 
E. coli and other bacterial host cells, it is difficult to engineer recombinant 
bacteria to produce large amounts of a foreign protein and, at the same 
time, to grow to a high cell density. This is because a recombinant bacterial 
cell partitions its resources between production of the foreign protein and 
cell growth. It would be advantageous to be able to first grow cells to a high 
density and then to shift the allocation of available resources from growth 
to foreign-protein production. With this in mind, one group of workers 
engineered a quiescent cell expression system in which a plasmid-encoded 
protein is expressed in nongrowing but metabolically active cells. The 
quiescent state is established by the overexpression of Rcd, a regulatory 
protein, in an hns mutant E. coli host cell. The hns gene codes for a histone-like 
nucleoid-structuring protein. Cultures of the hns mutant of E. coli in which 
the rcd gene is induced gradually cease synthesizing host proteins but 
continue synthesizing plasmid-encoded foreign proteins for many hours after 
induction. In one study that utilized this system, the rcd gene was placed 
under the transcriptional control of the p_R promoter while the recombinant 
protein gene (encoding a single-chain antibody variable fragment [scFv]) 
was controlled by the p_L promoter (Fig. 17.5). The activities of both the p_R 
and p_L promoters are repressed by a temperature-sensitive cI repressor 
protein; in this system, the gene for this protein is encoded in the host 
chromosomal DNA. When cells are grown at 30°C, the cI repressor prevents 
transcription from both p_R and p_L. When the temperature is shifted to 42°C, 
the temperature-sensitive cI repressor protein is inactivated, and 
transcription can proceed from both p_R and p_L (see chapter 6). The temperature shift 
therefore causes the cells to become quiescent and at the same time to 
synthesize the recombinant scFv protein. In this particular case, the scFv 
protein contained a leader peptide that directed approximately 90% of the protein to be 
secreted into the growth medium. As shown in Table 17.1, in both batch 
and fed-batch modes, the quiescent cells produce less biomass and secrete 
considerably more of the scFv protein into the growth medium than do 
control E. coli host cells engineered to express scFv under the control of the 
p_L promoter. Understanding the commercial potential of this unique 
system, the scientists who developed this approach have applied for a 
patent to protect their intellectual property rights. 

Protein Secretion 

High-level cytoplasmic expression in E. coli of many different foreign 
proteins results in the formation of inclusion bodies consisting of insoluble 
improperly folded protein. Even when the foreign protein is soluble, puri¬ 
fying it from a cytoplasmic extract can be a major undertaking. In addition, 
sometimes proteins that are secreted into the growth medium are produced 
at a much higher level than when they are expressed in the cytoplasm. 
While these considerations make only a small difference in laboratory-scale 
experiments, they are of critical importance when foreign proteins are 
produced on a large scale. 

FIGURE 17.5 Synthesis of a single-chain antibody fragment (scFv) in "quiescent E. coli 
cells." At 30°C, the cI repressor, encoded by cI857 (inserted into the chromosomal 
DNA), binds to the operators of the p_R and p_L promoters and prevents transcription 
of the plasmid-encoded scFv and rcd genes. At 42°C, the temperature-sensitive cI 
repressor is inactivated so that transcription directed by the p_R and p_L promoters 
proceeds. Turning on the p_R promoter causes the Rcd protein to be synthesized, 
thereby causing the cells to become quiescent. At the same time, turning on the p_L 
promoter activates transcription of the gene encoding scFv. When the rcd gene is 
induced, a mutant of the hns gene causes the cessation of host cell protein 
synthesis. 

TABLE 17.1 Cell growth and foreign protein (scFv) secreted into the growth medium 
for E. coli cells with induced quiescence compared to control (wild-type) cells 

             Batch culture                           Fed-batch culture 
Cell type    Cell growth          scFv secreted      Cell growth          scFv secreted 
             (optical density     (mg/liter)         (optical density     (mg/liter) 
             at 600 nm)                              at 600 nm) 
Quiescent    3.5                  37                 20                   150 
Control      20                   13                 80                   35 
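
The values in Table 17.1 can be used to compare how much scFv is secreted per unit of cell growth. The following sketch uses only the numbers from the table; the choice of "mg/liter of scFv per optical density unit" as a rough per-cell measure, and the names used in the code, are illustrative assumptions.

```python
# (cell growth as optical density at 600 nm, scFv secreted in mg/liter),
# values taken from Table 17.1.
TABLE_17_1 = {
    ("quiescent", "batch"): (3.5, 37.0),
    ("control", "batch"): (20.0, 13.0),
    ("quiescent", "fed-batch"): (20.0, 150.0),
    ("control", "fed-batch"): (80.0, 35.0),
}


def scfv_per_od(cell_type: str, mode: str) -> float:
    """Secreted scFv (mg/liter) divided by optical density at 600 nm."""
    od, scfv = TABLE_17_1[(cell_type, mode)]
    return scfv / od


for mode in ("batch", "fed-batch"):
    quiescent = scfv_per_od("quiescent", mode)
    control = scfv_per_od("control", mode)
    print(f"{mode}: quiescent {quiescent:.2f}, control {control:.2f}, "
          f"ratio {quiescent / control:.1f}")
```

Expressed this way, the difference between quiescent and control cells is larger than the difference in total secreted scFv alone, because the quiescent cultures reach a much lower cell density.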


One group of investigators observed that expression levels were quite 
low when they expressed several different foreign proteins, i.e., human 
granulocyte-macrophage colony-stimulating factor, α-interferon 2b 
(IFN-α2b), or scFv, under the transcriptional control of the strong p_m/xylS 
promoter/regulator system. The yields of human granulocyte-macrophage 
colony-stimulating factor and scFv, but not IFN-α2b, increased 
dramatically when the genes encoding these proteins were fused to a translocation 
signal sequence (Fig. 17.6). Interestingly, different translocation signal 
sequences were optimally effective with each of the proteins tested. To 
obtain a high level of expression of IFN-α2b, before assembling the 
construct, it was necessary to chemically synthesize the gene in order to 
eliminate the use of codons that are rarely used in E. coli. While the use of 
translocation signal sequences significantly stimulated the levels of expres¬ 
sion of these three human proteins, depending on the protein and the 
translocation signal sequence, from 20 to 50% of the protein that was pro¬ 
duced was found to be in an insoluble form. In order for this system to be 
used routinely for the large-scale production of human proteins in E. coli, a 
strategy that minimizes the extent of insoluble protein formation needs to 
be developed. 

Reducing Acetate 

It is often difficult to achieve high levels of foreign-gene expression and a 
high cell density at the same time because of the accumulation of harmful 
waste products, especially acetate, which inhibits both cell growth and pro¬ 
tein production and also wastes carbon and energy resources. One strategy 
to reduce the inhibitory effects of acetate is to remove the acetate from the 
culture during the course of the fermentation. This may be achieved by 
several different methods, including continuous dialysis and the use of 
macroporous ion-exchange resins. However, these methods tend to remove 
nutrients that are necessary for cell growth along with the acetate. 

FIGURE 17.6 Engineering proteins for high-level expression following 
high-cell-density growth. (A) A target protein gene is placed under the transcriptional control 
of the strong and inducible (in this case by m-toluic acid) p_m promoter. (B) The target 
gene is fused to any one of several different secretion signal sequences. (C) Rare 
codons are removed from protein coding sequences that are not well expressed by 
chemically synthesizing the entire gene. GM-CSF, granulocyte-macrophage 
colony-stimulating factor; IFN, IFN-α2b. 

Since acetate is often associated with the use of glucose as a carbon 
source, lower levels of acetate, and hence higher yields of protein, are 
generally obtained when fructose or mannose is used as a carbon source. Another 
strategy for reducing acetate accumulation in rich medium without 
impairing cell growth entails decreasing the glucose uptake rate of the cells 
by adding methyl α-glucoside, a glucose analogue, to the growing cells. The 
same effect has also been achieved by using an E. coli host cell that contained 
a mutation in ptsG, a gene encoding enzyme II in the glucose 
phosphotransferase system. In a comparison of batch cultures of wild-type and ptsG 
mutant E. coli cells in rich medium, with both carrying a plasmid expressing 
β-galactosidase activity, the wild-type cells attained a density of 
approximately 10 grams (dry weight) per liter, while the mutant cells attained more 
than 15 grams (dry weight) per liter. At the same time, the mutant cells 
synthesized about 25% more β-galactosidase per gram (dry weight) of cells than 
the wild-type cells did. Overall, the ptsG mutant cells synthesized nearly 
twice as much β-galactosidase as did the wild-type cells. 
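
The "nearly twice as much" figure for the ptsG mutant follows from combining the two reported improvements, as the short calculation below shows. The function name is hypothetical; the inputs are the values given in the text, with the mutant density taken at its stated lower bound of 15 grams per liter.

```python
def relative_total_product(wild_type_density: float,
                           mutant_density: float,
                           extra_product_per_gram: float) -> float:
    """Total product of the mutant culture relative to the wild-type culture.

    Total product scales with cell density multiplied by product per gram
    of cells, so the two relative gains are multiplied together.
    """
    density_gain = mutant_density / wild_type_density
    per_cell_gain = 1.0 + extra_product_per_gram
    return density_gain * per_cell_gain


# 10 vs. 15 grams (dry weight) per liter and 25% more product per gram of cells.
ratio = relative_total_product(10.0, 15.0, 0.25)
print(f"Mutant culture makes about {ratio:.3f} times as much product")
```

Because the mutant density is reported as more than 15 grams per liter, the result of this calculation is a lower bound on the overall improvement.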

Since it is often much easier and quicker to alter a particular host cell 
by genetic transformation than by mutagenesis and selection, alternative 
means of reducing acetate production in cells were developed. One of these 
methods includes introducing a gene (from B. subtilis) encoding the enzyme 
acetolactate synthase into E. coli host cells. This enzyme catalyzes the for¬ 
mation of acetolactate from pyruvate, thereby decreasing the flux through 
acetyl coenzyme A to acetate (Fig. 17.7). In practice, the acetolactate syn¬ 
thase genes are introduced into the cell on one plasmid, while the target 
gene (encoding the protein that is to be overexpressed in E. coli) is intro¬ 
duced on a second plasmid from a separate incompatibility group. The 
cells that were transformed with the acetolactate synthase genes produced 
75% less acetate than the nontransformed cells and instead synthesized 
acetoin, which is approximately 50-fold less toxic to cells than acetate. The 
protein yield was also doubled. 

FIGURE 17.7 Schematic representation of the pathways for glucose metabolism in an 
E. coli strain that has been transformed with a plasmid carrying the genes for the 
protein subunits of acetolactate synthase (ALS). Note that the conversion of glucose 
to biomass is a multistep process. CoA, coenzyme A. 

An alternative strategy to converting acetate to acetoin is to redirect 
carbon flow to the tricarboxylic acid (TCA) cycle (citric acid cycle, or Krebs 
cycle). This is necessary because recombinant protein production decreases 
carbon flow in the TCA cycle as a consequence of the withdrawal of the 
intermediates that serve as protein precursor biochemicals. In one study, 
workers overexpressed the gene for the enzyme phosphoenolpyruvate 
carboxylase, which converts phosphoenolpyruvate to oxaloacetate, with 
the result that they obtained a 17% increase in the specific growth rate of 
the E. coli cells and a 44% decrease in acetate production. Unfortunately, 
overexpressing this enzyme also decreases the amount of glucose uptake 
by the bacterial cells and diminishes the growth rate. As an alternative 
approach to replenishing the TCA cycle, another group of researchers 
transformed E. coli host cells with the gene for the enzyme pyruvate 
carboxylase, which converts pyruvate directly to oxaloacetate (Fig. 17.8). Since 
E. coli does not normally contain pyruvate carboxylase, the gene was 
isolated from a strain of the gram-negative bacterium Rhizobium etli. With the 
introduction of pyruvate carboxylase, acetate levels were decreased, the 
cell yield was increased, and the amount of foreign protein synthesized 
was increased (Table 17.2). This result reflects the fact that the addition of 
pyruvate carboxylase allows E. coli cells to use the available carbon more 
efficiently, directing it away from acetate toward biomass and protein 
formation. Although it has not been tested extensively, it is thought that this 
strategy may be a generally effective method for increasing the level of 
expression of foreign proteins produced in E. coli host cells. 

FIGURE 17.8 Replenishment of the TCA cycle in E. coli by the introduction of a gene 
from R. etli encoding pyruvate carboxylase. This avoids the conversion of pyruvate 
to acetate. The TCA cycle may also be replenished by the introduction of a gene 
encoding aspartase, converting aspartate in the medium to fumarate. Note that the 
conversion of glucose to biomass is a multistep process. CoA, coenzyme A. 

Similar to the strategy discussed above, the TCA cycle may also be 
replenished by converting aspartate to fumarate (Fig. 17.8). To do this, E. coli 
host cells were transformed with the gene for L-aspartate ammonia lyase 
(aspartase) under the control of the strong tac promoter on a stable 
low-copy-number plasmid. Aspartase activity is induced by the addition of 
IPTG at the mid- to late log phase of growth. The target recombinant protein 
is introduced on a separate plasmid. Using this system in minimal medium 
containing aspartate, the production of different recombinant proteins could 
be increased up to fivefold, with 30 to 40% more biomass production. 


Bioreactors 

A cursory examination of the biochemical engineering literature may give 
the impression that there are a limitless number of bioreactor designs. 
However, closer inspection reveals that virtually all of these designs fall 
into three fundamental classes: 

• Stirred-tank reactors (STRs), which have internal mechanical agitation
(Fig. 17.9A)

• Bubble columns, which rely on the introduction of air or another gas
(sparging) for agitation (Fig. 17.9B)

• Airlift reactors, which have either an internal (Fig. 17.9C) or an
external (Fig. 17.9D) loop; the mixing and circulation of the culture
fluid in these reactors are the results of the motion of an introduced
gas (usually air), which causes density differences within the
different parts of the bioreactor.

The traditional, and by far the most commonly used, bioreactor is the 
STR. This type of bioreactor has several advantages over other bioreactor 
configurations. 

• It has highly flexible operating conditions. 

• It is readily available commercially. 

• It provides efficient gas transfer to the growing microbial cells, or, in
the words of fermentation engineers, the volumetric mass transfer
coefficient, kLa, of STRs is high.

• It has been used extensively by fermentation engineers and
microbiologists for growing a variety of microorganisms.
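
To make the role of kLa concrete, the following minimal sketch uses the standard film-model expression for the oxygen transfer rate, OTR = kLa (C* - C), where C* is the saturation concentration of dissolved oxygen and C is the concentration in the bulk liquid. The expression is a general engineering relation rather than something stated in this chapter, and every number below is a hypothetical value chosen only for illustration.

```python
def oxygen_transfer_rate(kla_per_h, c_sat_mg_per_l, c_bulk_mg_per_l):
    """Return the oxygen transfer rate in mg of O2 per liter per hour.

    Uses OTR = kLa * (C* - C). All inputs are illustrative assumptions.
    """
    if c_bulk_mg_per_l > c_sat_mg_per_l:
        raise ValueError("bulk concentration cannot exceed saturation")
    return kla_per_h * (c_sat_mg_per_l - c_bulk_mg_per_l)


# Hypothetical comparison: same driving force, two different kLa values.
otr_low_kla = oxygen_transfer_rate(50.0, 7.0, 2.0)
otr_high_kla = oxygen_transfer_rate(200.0, 7.0, 2.0)
print(otr_low_kla, otr_high_kla)
```

With the same concentration difference, a fourfold higher kLa gives a fourfold higher rate of oxygen delivery, which is why a high kLa is listed as an advantage of STRs.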

In an STR, gas, usually air, is added to the culture medium under
pressure through a device called a sparger, which can be either a ring with
many small holes or a tube with a single orifice. Although sparging rings
generate smaller bubbles and consequently create better initial gas
distribution, sparging tubes are often preferred in many small-scale applications
(<20 liters) because they are less likely to become plugged. Thorough
dispersion of the gas within the bioreactor requires one or more impellers
(agitators) in addition to the sparger. Mechanical agitation of the culture
medium by the impellers breaks larger bubbles into smaller ones, disperses
the bubbles throughout the medium, and enhances the residence time of
the bubbles in the bioreactor. At high levels of agitation, the mean size of
the bubbles in large bioreactors is essentially independent of the size of the
holes in the sparger. The type of impeller, its rotational speed, and the
physicochemical properties of the liquid phase are important factors that
give rise to efficient gas dispersion. In large bioreactors, however, if the
initial gas distribution from the sparger is not uniform across the tank, even
vigorous agitation may not create a homogeneous gas environment.

Because of the corrosive or abrasive nature of many culture media and 
sterilization procedures, STRs are usually constructed from stainless steel 
or glass. The glass units are usually limited to laboratory-scale bioreactors 
that have a capacity of <50 liters. 

One limitation on the size of a bioreactor is the ability of the system to
efficiently remove heat that is generated as a consequence of either the
metabolism of the growing cells or the energy input by agitation. Too much
heat raises the temperature, alters the physiological state of the cells,
and decreases the product yield. Heat can be removed by using a cooling
jacket around the reaction vessel or by internal coils. Although internal
cooling coils are more effective than jackets in keeping the fermentation
reaction close to the desired temperature, they can become fouled (coated)
with microorganisms, which prevents cooling, and they sometimes
interfere with the proper agitation of the fermentation broth.

FIGURE 17.9 Simplified examples of bioreactor configurations. (A) STR; (B) bubble
column; (C) internal-loop airlift reactor with a central draft tube; (D) external-loop
airlift reactor. The arrows within the bioreactors indicate the direction of flow of the
culture medium.

TABLE 17.2 Relative acetate levels, cell yields, and foreign-protein activities from
E. coli cells transformed with an R. etli pyruvate carboxylase gene

E. coli strain            Acetate concentration   Cell yield   Foreign-protein activity
- Pyruvate carboxylase    1.0                     1.0          1.0
+ Pyruvate carboxylase    0.43                    1.41         1.68

Adapted from March et al., Appl. Environ. Microbiol. 68:5620-5624, 2002.
The foreign protein was β-galactosidase, whose activity is relatively easy to quantify. The data have
been normalized to the values for the E. coli strain without (-) pyruvate carboxylase.
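
The normalized values in Table 17.2 can be restated as percent changes relative to the strain without pyruvate carboxylase. The short sketch below does that arithmetic. The final ratio of activity to cell yield is an added illustration that assumes the reported activity is a total per culture rather than a per-cell value, which the table does not specify.

```python
# Normalized values for the strain carrying pyruvate carboxylase (Table 17.2).
# The control strain is 1.0 for every quantity by definition.
with_pyc = {"acetate": 0.43, "cell_yield": 1.41, "protein_activity": 1.68}

# Percent change relative to the control strain.
percent_change = {
    name: round((value - 1.0) * 100.0, 1) for name, value in with_pyc.items()
}

# Assumption: activity is reported per culture, so dividing by cell yield
# gives a rough per-cell figure.
activity_per_cell_yield = with_pyc["protein_activity"] / with_pyc["cell_yield"]

for name, change in percent_change.items():
    print(f"{name}: {change:+.1f}%")
print(f"activity per unit cell yield: {activity_per_cell_yield:.2f}")
```

On these numbers, acetate falls by 57% while cell yield and foreign-protein activity rise by 41% and 68%, respectively.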

Contamination of a fermentation with fungi or bacteria is usually 
disastrous. Therefore, bioreactors are designed so that they can be
sterilized, usually with pressurized steam. There should be no internal dead
spaces or surfaces that escape contact with the steam during sterilization. 
All seals, probes, and valves also must be readily steam sterilizable. When 
a bioreactor is designed, a trade-off is often made between a full set of ports 
for probes, which enables monitoring of the fermentation parameters, and 
fewer ports, which makes the maintenance of sterility easier. 

The high level of agitation of the culture medium during a
fermentation reaction often causes considerable foaming. Excessive foam can wet
the filter at the port through which the introduced air exits the bioreactor,
thereby both decreasing the air flow and providing a potential pathway for
the entry of contaminating cells. Either chemical antifoaming agents or
mechanical foam breakers can be used to control foaming. However, the
chemical agents can diminish the extent of microbial growth by preventing
oxygen transfer or, in some cases, by inhibiting cellular enzymes.
Furthermore, if an antifoaming compound is not removed before
downstream processing, it can contaminate the final product. Foaming can also
be controlled by providing sufficient "head space" within the bioreactor,
i.e., space above the liquid in which the bubbles can dissipate. In practice,
then, the "working volume," or actual volume of the culture, in an STR is
typically only about 75% of the total volume of the bioreactor.
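
The head-space rule of thumb translates directly into vessel sizing. The sketch below applies the approximately 75% fill fraction quoted above. The function names and the example volumes are hypothetical and are not taken from the text.

```python
def working_volume_l(total_volume_l, fill_fraction=0.75):
    """Culture volume that fits in a vessel at the given fill fraction."""
    return total_volume_l * fill_fraction


def total_volume_needed_l(culture_volume_l, fill_fraction=0.75):
    """Vessel size needed to hold a culture while leaving head space."""
    return culture_volume_l / fill_fraction


# Hypothetical examples.
culture_in_1000_l_vessel = working_volume_l(1000.0)
vessel_for_600_l_culture = total_volume_needed_l(600.0)
print(culture_in_1000_l_vessel, vessel_for_600_l_culture)
```

A nominal 1,000-liter STR would therefore hold about 750 liters of culture, and a 600-liter culture would call for a vessel of about 800 liters.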

Many of the considerations that apply to STRs also apply to pneumatic 
reactors, such as bubble columns and airlift bioreactors. Thus, for example, 
sterility, constant pH, and constant temperature are key components of any 
fermentation, regardless of the precise configuration of the bioreactor. 

The configurations of bubble columns and airlift bioreactors give them 
some distinct advantages over STRs. These pneumatic reactors are more 
energy efficient than STRs because agitation is provided by the injection of 
a stream of air—or another gas if anaerobic microorganisms are being 
grown—rather than by a mechanical stirrer. Also, with the elimination of 
the mixer shaft in these units, there is one less potential site of entry for 
contaminating organisms. 

Pneumatic reactors generate a lower-shear environment than do STRs. 
Also, in airlift reactors, the shear stress is more evenly distributed 
throughout the vessel than in STRs. The reduction of shear forces is
important for the following reasons.

• Genetically engineered microorganisms are often more susceptible
to lysis when exposed to shear stress than are unmodified
organisms, because the extra metabolic burden of synthesizing a foreign
protein often causes genetically engineered microorganisms to form
weakened cell walls.

• A frequent cellular response to the shear forces is decreased
synthesis of all cellular proteins, including the recombinant protein.

• Shear stress can alter the physical and chemical properties of the
cells so that the downstream processing steps become more difficult
to perform. For example, the fermentation conditions can
inadvertently increase the amount of surface polysaccharides that a
microorganism produces and, as a consequence, can change the conditions
for effective harvesting and lysis of these cells, making it more
difficult to purify the target protein.

In bubble columns, the air is introduced under high pressure near the 
bottom, but the smaller bubbles coalesce into larger ones as they rise 
through the column, leading to uneven gas distribution. In addition, the 
use of high-pressure air tends to cause excessive foaming of the medium. 
These disadvantages restrict the flexibility or effective range of operating 
conditions, as well as the potential size of bubble columns. 

Airlift bioreactors, however, can be readily adapted for either pilot plant
or large-scale fermentation processes. In an airlift reactor, the gas is
introduced into the bottom of a vertical channel (riser). Both the gas and liquid
flow up the riser until they reach an open space at the top (gas-liquid
separator), where the gas is at least partially disengaged from the liquid. The
degassed liquid, which is denser than the gassed liquid, descends in a
separate vertical channel (downcomer) and moves along the base of the reactor
until it reaches the bottom of the riser. In this way, the culture fluid and cells
are continuously being circulated around the bioreactor.

There are two main types of airlift bioreactors: those that have a single 
container with internal baffles that create interior liquid circulation
channels (internal-loop reactors) (Fig. 17.9C) and those that have an external
loop so that the culture liquid circulates through separate, independent 
channels (external-loop reactors) (Fig. 17.9D). Internal-loop airlift reactors 
are simple in design, but once they are constructed, both the volume and 
the circulation rate are fixed for all fermentation processes. In contrast, the 
external loop of external-loop airlift reactors can be easily changed or 
modified, e.g., by altering its volume, to suit the requirements of different 
fermentations. 

Airlift bioreactors are generally more efficient than bubble columns, 
especially for denser or more viscous suspensions of microorganisms. In 
airlift reactors, mixing is generally better, and bubble coalescence is not as 
big a problem as it is in bubble columns. In extremely large airlift
fermenters, such as the 1,500,000-liter fermenter built by Imperial Chemical
Industries, Ltd., in England for the production of single-cell protein, it takes 
a considerable amount of time for a cell to complete a full cycle through the 
reactor. To prevent the cells from becoming substrate depleted while they 
are traversing the bioreactor, there are multiple injection points for the 
introduction of substrate along the length of the unit. 


Typical Large-Scale Fermentation Systems 

When recombinant microorganisms are used to overproduce protein
products, such as pharmaceuticals like insulin, cells are typically grown to mid-
to late log phase of the growth cycle so that the target protein levels will be 
optimal. On the other hand, when recombinant microorganisms are used 
as "factories" to synthesize useful metabolites, such as antibiotics, host cells 
are commonly grown to the deceleration or stationary phase, where the 
synthesis of secondary metabolites is often optimal. Clearly, these sorts of 
differences must be considered when a large-scale fermentation process is 
being developed. 

For maximal protein production, it is generally preferable to use cloned 
genes that are under the control of strong promoters that can be regulated. 
Initially, it was thought that constitutive expression of a cloned gene would 
be sufficient to obtain reasonable quantities of the product. However,
experience has shown that continuous transcription and translation of a cloned
gene drains energy from essential cell functions and slows cell growth. 
With an inducible system, the expression of a cloned gene can be confined 
to a specific period in the growth cycle of the microorganism. On these 
grounds, for optimal protein production, the process should consist of two 
separate stages. First, the cells are grown under optimum conditions to a 
relatively high cell density. Second, depending on the nature of the
promoter that drives the cloned gene, transcription is induced either by
shifting the temperature or by adding a chemical inducer, such as IPTG, to 
the medium. 

A two-stage system is not easy to implement in a large bioreactor (>100 
liters) because it is technically very difficult either to raise the temperature 
quickly, typically from 30 to 42°C, or to ensure that the chemical inducer is 
rapidly and evenly mixed in a large vessel. Moreover, many chemical
inducers, such as IPTG, are too expensive to be used on a large scale. 
However, as discussed below, this problem can be overcome by using two 
connected bioreactors (two-stage fermentation) so that the cells are grown 
in the first vessel and the induction is carried out in the second. Under 
these conditions, growth and induction are separately optimized, thereby 
increasing the overall amount of product formed per unit of time
(productivity) of the fermentation.

Two-Stage Fermentation in Tandem Airlift Reactors 

E. coli NM989, which carries the gene encoding the enzyme T4 DNA ligase
under the transcriptional control of the pL promoter and a
temperature-sensitive cI repressor, was grown and induced in a two-stage airlift
bioreactor. In this bacterial strain, the T4 DNA ligase gene was integrated in the
chromosomal DNA, a location obviating any potential problems of plasmid
instability that might occur during extended fermentation. The growth
stage was carried out at 30°C in an external-loop airlift bioreactor that had
a 10-liter working volume. The T4 DNA ligase gene was not expressed
under these conditions. A second external-loop airlift bioreactor, with a
working volume of approximately 5 liters at 42°C, was used for the
induction stage (Fig. 17.10). The two bioreactors were linked by a transfer tube
with a pump that controlled the continuous flow of cell suspension from
the growth stage bioreactor (first) into the induction stage vessel (second).
In addition, cells suspended in culture medium were removed from the
induction stage bioreactor at a specific rate and prepared for downstream
processing.

The maximal specific growth rate (μmax) of the microbial culture was
approximately 0.66 reciprocal hour in the first bioreactor and 0.54
reciprocal hour in the second. These values correspond to cell doubling times of
63 and 77 minutes, respectively. Fresh medium was continuously added at
a rate of 2 liters per hour to the growth stage fermenter, and cell suspension
(effluent) was simultaneously removed from the induction stage fermenter
at the same rate. As a consequence of the liquid volumes of the two
bioreactors, an average cell spent about 5 hours in the growth stage bioreactor and
2 hours in the induction stage bioreactor. The different residence times in
the two phases of this fermentation process were necessary to optimize the
number of cells produced and the yield and stability of the T4 DNA ligase.
Generally, residence times can be altered as required by adjusting the
relative working volumes in the two fermenters of a two-stage system and by
adjusting the volumetric rate of input of nutrients into the first bioreactor.
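
The doubling times and residence times quoted above follow from two simple relations: doubling time = ln 2 / μ, and mean residence time = working volume / volumetric flow rate. The sketch below reproduces the arithmetic using the growth rates, the 2-liter-per-hour feed, and the 10- and 5-liter working volumes given in the text. The function names are hypothetical.

```python
import math


def doubling_time_min(mu_per_h: float) -> float:
    """Doubling time in minutes for a specific growth rate in reciprocal hours."""
    return math.log(2) / mu_per_h * 60.0


def residence_time_h(working_volume_l: float, flow_l_per_h: float) -> float:
    """Mean residence time in hours for a continuously fed vessel."""
    return working_volume_l / flow_l_per_h


growth_doubling = doubling_time_min(0.66)       # growth stage, 30 degrees C
induction_doubling = doubling_time_min(0.54)    # induction stage, 42 degrees C
growth_residence = residence_time_h(10.0, 2.0)  # 10-liter vessel
induction_residence = residence_time_h(5.0, 2.0)  # nominal 5-liter vessel

print(f"doubling times: {growth_doubling:.0f} and {induction_doubling:.0f} min")
print(f"residence times: {growth_residence:.1f} and {induction_residence:.1f} h")
```

For a nominal 5-liter induction vessel, the simple ratio gives 2.5 hours rather than the "about 2 hours" quoted. Because the working volume is described only as approximately 5 liters, the difference may reflect rounding or a somewhat smaller actual volume; the source does not say which.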

The double-external-loop design of the airlift fermenter (Fig. 17.10)
used in this work facilitated the adjustment of the working volumes of the
two fermenters relative to one another. It also added versatility to the
system, so that it was possible to obtain a variety of different growth
conditions for different populations of recombinant cells. For T4 DNA ligase
production, the best results were obtained when approximately 33 mL of
cell suspension was transferred every minute from the growth stage
bioreactor to the induction stage bioreactor. Because this amount of cell
suspension was equivalent to only 0.67% of the volume of the induction stage
bioreactor, the incoming cells underwent a virtually instantaneous
temperature shift from 30 to 42°C. Nutrients in a concentrated form were
added at a specific rate to the induction stage bioreactor throughout the
fermentation to keep the cells in this bioreactor in log phase. This action
prevented the T4 DNA ligase from being degraded by the proteolytic
enzymes that are normally synthesized during the deceleration and
stationary phases.

FIGURE 17.10 Two-stage airlift reactor used for the temperature-dependent induction
of a protein product. Cells from the growth stage (left) fermenter, which is at 30°C,
are pumped into the induction stage fermenter (right), which is at 42°C. Each
bioreactor has a double external loop that is fitted with valves. By changing the valve
settings, working volumes best suited for different fermentations can be created.
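
The claim that the temperature shift is virtually instantaneous can be checked with a simple mixing calculation. The sketch below assumes that the transferred suspension and the vessel contents have the same density and heat capacity, and it ignores heat input from the temperature controller; both are simplifying assumptions that are not stated in the text. The transfer rate is taken as 2 liters per hour, which is about 33 mL per minute.

```python
def transfer_fraction(pulse_ml: float, vessel_ml: float) -> float:
    """Fraction of the vessel volume represented by one minute of transfer."""
    return pulse_ml / vessel_ml


def mixed_temperature_c(pulse_ml, pulse_temp_c, vessel_ml, vessel_temp_c):
    """Volume-weighted mean temperature after mixing (equal heat capacities assumed)."""
    total_ml = pulse_ml + vessel_ml
    return (pulse_ml * pulse_temp_c + vessel_ml * vessel_temp_c) / total_ml


pulse_ml = 2000.0 / 60.0  # 2 liters per hour expressed in mL per minute
vessel_ml = 5000.0        # nominal 5-liter induction vessel

fraction_percent = transfer_fraction(pulse_ml, vessel_ml) * 100.0
temp_after_mixing = mixed_temperature_c(pulse_ml, 30.0, vessel_ml, 42.0)

print(f"transfer per minute: {fraction_percent:.2f}% of vessel volume")
print(f"temperature after mixing: {temp_after_mixing:.2f} degrees C")
```

Under these assumptions, one minute of inflow lowers the vessel temperature by less than 0.1°C, so the incoming cells experience essentially the full 42°C at once.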

With this continuous two-stage bioreactor, induced E. coli NM989 can
be grown to a density of approximately 4 grams (dry weight) of cells per
liter of culture. After induction, about 4% of the cell protein is T4 DNA
ligase, an amount that corresponds to approximately 25,000 units of
enzyme activity per gram (dry weight). This process can produce
approximately 100,000 units of enzyme activity per liter of culture, or about
4,800,000 units per day. Assuming that it is possible to recover about 20%
of the initial activity following purification of the enzyme and that the
enzyme sells for about $0.25 per unit, then the final yield of purified
enzyme in 1 day will be worth about $240,000. Although these calculations
do not consider all of the costs that go into the production of protein from
a genetically engineered microorganism, it is clear that for highly valued
products, small to moderate-size continuous fermentation systems can
generate significant returns on the initial capital investment.
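
The productivity estimate above can be reproduced step by step. The sketch below uses only the figures quoted in the text (cell density, units per gram, the 2-liter-per-hour effluent rate, recovery, and price). The variable names are hypothetical, and, as the text notes, production costs are not considered.

```python
cell_density_g_per_l = 4.0   # grams (dry weight) of cells per liter
units_per_g = 25_000.0       # enzyme units per gram (dry weight)
effluent_l_per_h = 2.0       # continuous harvest rate from the second vessel
recovery_fraction = 0.20     # activity retained after purification
price_per_unit_usd = 0.25    # assumed selling price per unit

units_per_liter = cell_density_g_per_l * units_per_g
liters_per_day = effluent_l_per_h * 24.0
units_per_day = units_per_liter * liters_per_day
purified_units_per_day = units_per_day * recovery_fraction
value_per_day_usd = purified_units_per_day * price_per_unit_usd

print(f"{units_per_liter:,.0f} units per liter")
print(f"{units_per_day:,.0f} units per day")
print(f"${value_per_day_usd:,.0f} of purified enzyme per day")
```

The chain is 100,000 units per liter, 48 liters per day, 4,800,000 units per day, 960,000 units recovered, and about $240,000 per day.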

Two-Stage Fermentation in a Single Stirred-Tank Reactor 

The tripartite fusion protein AGβ-Gal, which is used for immunological
assays, was produced on a large scale in a single STR. The gene encoding
AGβ-Gal was constructed by recombinant DNA techniques and encodes
the five immunoglobulin G-binding regions of Staphylococcus aureus
protein A, two immunoglobulin G-binding regions from Streptococcus sp.
strain G148 protein G, and β-galactosidase from E. coli. The synthetic
AGβ-Gal gene was placed under the control of the bacteriophage λ pR
promoter, which is regulated in the same manner as the pL promoter; cloned
into a plasmid that carries the gene for ampicillin resistance; and
introduced by transformation into E. coli. The strain with the AGβ-Gal plasmid
carries a second plasmid that has the genes for a temperature-sensitive cI
repressor protein and a kanamycin resistance gene.

A 5-liter volume of these cells was grown in an STR at 30°C in the
presence of both ampicillin and kanamycin (to provide selective pressure for
the retention of both plasmids) and then used to initiate growth without
antibiotics at 30°C in a 45-liter STR. The cell suspension in the 45-liter
fermenter in turn served as an inoculum for a 600-liter STR, where the cells
were grown at 30°C without antibiotics (Fig. 17.11). In general, to keep the
cost of the process to a minimum, antibiotics are not added to large-scale
microbial cultures. When the cell density in the 600-liter bioreactor reached
the equivalent of about 4 grams (dry weight) per liter of growth medium,
the temperature was shifted from 30 to 40°C to induce the expression of the
AGβ-Gal protein. Under these conditions, it takes about 1 hour to reach
40°C. A temperature of 40°C rather than 42°C was chosen because the
lower temperature was found to yield the same level of AGβ-Gal protein
while allowing the cells to grow for a longer time. In other words, the
low-temperature (40°C) conditions yielded a larger amount of protein product.

The specific activity of the AGβ-Gal protein increased for 2 hours after
the initiation of the induction and then decreased. This decrease in activity
was probably due to the synthesis of proteases by cells that had entered the
deceleration and stationary phases of the growth cycle. In addition, 50% of
the cells had lost their plasmids after growth for 4 hours at 40°C. These
problems notwithstanding, after 4 hours at 40°C, the AGβ-Gal protein was
approximately 20% of the dry weight of the total biomass. Considering the
very high yield of the target protein that was produced by this strategy, it
is probably not necessary to integrate the genes for the AGβ-Gal protein
and the cI repressor into the chromosomal DNA of the E. coli host cell in an
effort to increase the final yield.

Batch versus Fed-Batch Fermentation 

In some instances, a simple fed-batch strategy can be used to produce both
a high cell density and a high level of expression of a target protein (Fig.
17.12). For example, a plasmid carrying a gene encoding a hybrid protein
that includes the insulin B peptide under the control of the E. coli trp
promoter was introduced into a trp-minus mutant strain of E. coli that cannot
synthesize tryptophan, and the transformant was grown in media
containing various amounts of tryptophan. At high levels of tryptophan, the
synthesis of the target protein was repressed. However, after consumption
of the tryptophan in the medium by the growing cells, synthesis of the
target protein was induced. With this system, the addition of tryptophan to
the medium resulted, in batch cultures, in increases in the amounts of both
biomass and target protein produced. However, fed-batch fermentation
was more effective than batch fermentation with or without added
tryptophan (Table 17.3).

FIGURE 17.11 Scheme for the large-scale production of the protein AGβ-Gal. The
volumes in parentheses indicate how much culture medium was used at each step of
the process. The culture medium occupied about 60 to 75% of the total volume of
each of the bioreactors (fermenters).

In another experiment, IFN-γ was produced in E. coli. Expression of the
IFN-γ gene was controlled by the pL promoter regulated by the
temperature-sensitive cI repressor. Cells that contained this construct on a plasmid
were grown in either batch or fed-batch mode (with both stepwise and
constant-rate medium-feeding strategies). In the fed-batch mode, the
addition of growth medium was carried out simultaneously with the
temperature induction of the pL promoter in the late exponential growth phase. Use
of the fed-batch mode resulted in a significant increase in the length of the
cell growth phase following induction. This fed-batch strategy enabled
researchers studying this system to achieve a cell biomass that was 5-fold
higher and a final IFN-γ concentration that was 23-fold higher than they
were able to achieve in batch culture.

FIGURE 17.12 Schematic representation of the amount of foreign protein produced as
a function of time by recombinant E. coli following induction (arrow) in the mid-log
phase of growth. Prior to induction, no foreign protein is synthesized. After
induction in batch mode, in the absence of additional nutrients, the cells soon enter
stationary phase and synthesize proteases that degrade the foreign protein product.
After induction in fed-batch mode, the added nutrients ensure that the cells remain
in log phase for an extended time and do not produce any proteases until 1.5 to 2
hours later than the cells in batch mode; therefore, the foreign protein is more stable
and is easier to recover. In addition, the provision of nutrients in fed-batch mode
makes it less likely that plasmids carrying foreign genes will be lost than in batch
mode. The time represents the number of hours from the start of the fermentation.

Another group studied the fermentation conditions that yielded the
optimum expression of a monoclonal antibody (Fab) fragment directed
against tetanus toxoid under the transcriptional control of the E. coli lac
promoter. The plasmid construct included signal sequences that were
inserted immediately upstream of the antibody (light and heavy) genes to
target the antibody fragments to the E. coli periplasm. In this case,
inexpensive lactose, rather than the considerably more expensive IPTG, could be
used to induce the expression of the monoclonal antibody fragment gene,
provided that the host strain of E. coli was Lac+ and therefore able to take
up lactose from the medium and convert it to glucose and galactose. In fact,
allolactose, an isomer of D-lactose (and not D-lactose itself), is the actual
inducer of the E. coli lac operon and is formed only if the host strain of E.
coli contains a small amount of β-galactosidase (see chapter 6). An
important facet of these experiments is that lactose not only acts as an inducer,
but also is metabolized, providing an additional carbon source for the
Fab-producing E. coli cells, thereby supporting both cell growth and product
accumulation (Fig. 17.13). Again, in these experiments, fed-batch
fermentation was clearly superior to batch fermentation, yielding both a larger
amount of cell biomass and a greater concentration of Fab fragment.

TABLE 17.3 Comparison of batch and fed-batch fermentations for the production of
a fusion protein including the insulin B peptide

                                           Yield in fermentation system:
                                           Batch    Batch + Trp    Fed batch
Biomass (g [DW]/liter)                     6.7      12             20
Fusion protein/total protein (%)           4.6      7.9            11
Total amount of fusion protein (g/liter)   0.17     0.53           1.21
Plasmid-bearing cells (%)                  86       62             90

Adapted from Gosset et al., Appl. Microbiol. Biotechnol. 39:541-546, 1993. DW, dry weight. In the "Batch
+ Trp" fermentation, 0.1 g of tryptophan was added. In the fed-batch fermentation, 0.1 g of tryptophan was
added every 2 hours, for a total of five times during the course of the 10-hour fermentation. A larger
amount of tryptophan added to the batch fermentation did not increase the amount of either biomass or
target protein produced.
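
The advantage of fed-batch operation in Table 17.3 is easier to see as fold changes relative to simple batch culture. The sketch below computes them from the tabulated values. It also back-calculates the fraction of dry weight that would have to be protein for the three rows of each column to agree; that check is an added illustration, since the table does not report total protein content.

```python
# Values transcribed from Table 17.3.
table_17_3 = {
    "batch":     {"biomass_g_per_l": 6.7,  "fusion_pct": 4.6,  "fusion_g_per_l": 0.17},
    "batch_trp": {"biomass_g_per_l": 12.0, "fusion_pct": 7.9,  "fusion_g_per_l": 0.53},
    "fed_batch": {"biomass_g_per_l": 20.0, "fusion_pct": 11.0, "fusion_g_per_l": 1.21},
}

batch = table_17_3["batch"]
fed = table_17_3["fed_batch"]

# Fold improvement of fed-batch over simple batch culture.
biomass_fold = fed["biomass_g_per_l"] / batch["biomass_g_per_l"]
fusion_fold = fed["fusion_g_per_l"] / batch["fusion_g_per_l"]

# Implied protein fraction of dry weight:
# fusion (g/liter) = biomass (g/liter) * protein fraction * fusion share of protein.
implied_protein_fraction = {
    mode: row["fusion_g_per_l"] / (row["biomass_g_per_l"] * row["fusion_pct"] / 100.0)
    for mode, row in table_17_3.items()
}

print(f"biomass: {biomass_fold:.1f}-fold, fusion protein: {fusion_fold:.1f}-fold")
for mode, fraction in implied_protein_fraction.items():
    print(f"{mode}: implied protein content {fraction:.2f} of dry weight")
```

Fed-batch culture gives about a 3-fold gain in biomass but about a 7-fold gain in fusion protein, and all three columns imply a protein content of roughly 55% of dry weight, so the tabulated values are internally consistent.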

Fed-batch fermentation strategies have also been successfully employed
in the production of nonprotein products, such as poly(3-hydroxybutyrate)
and poly(3-hydroxybutyrate-co-3-hydroxyvalerate), biopolymers with
plastic-like properties (see chapter 13). In this case, the E. coli host cells
carried a plasmid that contained polyhydroxyalkanoate biosynthesis genes
from the bacterium Alcaligenes latus. By using an inexpensive growth
medium, such as whey (a waste by-product of cheese making that consists
mainly of lactose), to produce these polymers, it is hoped that a
commercially viable product can be produced on a large scale.

Another group reported that instead of a single introduction of lactose 
in a fed-batch fermentation, to produce active protein fragments of human 
apolipoprotein(a), a strategy that employed continuous induction with 
lactose beneficially influenced the expression of the target protein. With a 
75-liter fermenter, using lactose as the sole feed was not efficient for cell 
growth, presumably because the host strain of E. coli was unable to
metabolize galactose. However, with a 1:50 ratio of lactose to glycerol, the target
protein reached 16% of the total cellular protein. It remains to be seen
whether continuous lactose induction will benefit the expression of
proteins other than human apolipoprotein(a) and to what extent this process
can be scaled up. 

Harvesting Microbial Cells 

The first step in the process of purification of a product that is synthesized 
during microbial fermentation is the separation of the cells from the culture 
medium. Recombinant and native microbial cells can both be harvested 
with the same type of equipment. However, as a consequence of
physiological changes, such as alterations in cell size and the production of
extracellular polysaccharides, conditions that have been established for
nontransformed cells may not be optimal for recombinant cells expressing 
a foreign protein. 

For large volumes, either high-speed centrifugation, which is the
current method of choice, or membrane microfiltration is used to separate cells
from the growth medium. High-speed semicontinuous centrifuges have
been specifically designed for the harvesting of microbial cells. The cell
suspension is continuously fed into a running centrifuge, and the cells are
concentrated within it while the clarified medium is collected in an external
container. When the centrifuge chamber is full of packed cells, the run is
stopped and the cells are removed. The need to stop and then restart the
procedure periodically, especially when large volumes are being processed,
can be a major inconvenience. Furthermore, the cost of both the equipment
and the power to run it, the release of the microorganisms into the air as
aerosols during the harvesting procedure, and the difficulty in removing all
the microbial cells from the spent medium also limit the use of this
separation procedure.

Membrane filtration is an alternative method of separating the cells
from the culture medium. Unfortunately, with traditional (dead-end)
filtration, the microbial cells accumulate on the surface of the polymeric
membrane filter. Consequently, the flow rate of spent medium through the
membrane decreases rapidly (Fig. 17.14A). Increasing the pressure on the
membrane enhances the flow for a short time; however, the cells still
accumulate on the surface of the membrane and may even form a more compact
and less permeable layer as a result of the pressure.

FIGURE 17.13 Induced recombinant protein production with batch and fed-batch E.
coli cells grown to late logarithmic phase. The cells were either not induced, induced
with 0.05 mM IPTG, or induced with 2.0 grams of lactose per liter. (A) In batch
culture, the cells that were induced with lactose, which acts as both an inducer and
a growth substrate, grew to a significantly greater extent than either the
noninduced or the IPTG-induced cells. With fed-batch cells, the lesser extent of growth of
induced versus noninduced cells probably reflects the resources that the induced
cells direct to the synthesis of the Fab antibody fragment. (B) Under both batch and
fed-batch conditions, more of the Fab antibody fragment was produced when the
cells were induced with lactose than with IPTG. Fab antibody fragment synthesis in
the absence of inducer represents incomplete repression of the lac promoter.

FIGURE 17.14 Membrane filtration systems for concentrating microbial cells. (A)
Dead-end filtration; (B) cross-flow filtration. The arrows within each unit show the
direction of the liquid flow.

An alternative filtration technique entails passing the cell suspension
at a high speed across the surface of the membrane (cross-flow filtration)
(Fig. 17.14B). Under these conditions, only a very small fraction of the
circulating liquid actually goes through the membrane in any one pass.
The remaining cell suspension acts to sweep the membrane clean of
accumulated cells, so that the rate of liquid flow through the membrane does
not decrease as rapidly as it does in dead-end filtration. In both dead-end
and cross-flow filtration, the average pore size of the membranes is 0.2 to
0.45 µm. After many cycles in a cross-flow filtration system, almost all of
the culture medium will have passed through the membrane. The use of
cross-flow filtration is generally limited to laboratory-scale operations.
Most industrial-scale operations still rely on centrifugation.

The next step in the purification process depends on both the nature 
and the location of the product. If the final product is a protein and it is in 
the culture medium, the medium is concentrated, often by ultrafiltration, 
and the target protein is purified by column chromatography or other
standard procedures. If the product is a low-molecular-weight compound in the
culture medium, it can be purified by the appropriate extraction procedures. 
Finally, if the product is in the cellular fraction, the cells must be disrupted 
(lysed) before the steps leading to product purification are initiated. 


Disrupting Microbial Cells 

A large number of chemical, biological, and physical methods have been 
developed for disrupting microbial cells. All of these procedures represent 
a compromise, because they must be vigorous enough to break the
microbial cell walls yet gentle enough to ensure that the protein product is not
denatured. There is no single set of conditions for cell wall lysis because the 
cell walls of diverse microbial species are composed of different polymers. 

• In gram-positive bacteria, the cell wall is external to the cytoplasmic 
membrane and consists of a thick peptidoglycan layer of 
N-acetylglucosamine and N-acetylmuramic acid residues
cross-linked by oligopeptides.

• The cell wall of gram-negative bacteria has an outer membrane, a 
thin peptidoglycan layer, and a cytoplasmic membrane. 

• The yeast cell wall is composed of a thick layer of partially
phosphorylated mannans and β-glucan.

Cell wall composition and strength depend on culture conditions, the 
cellular growth rate, the phase of the growth cycle when the cells are
harvested, how the concentrated cells are stored, and whether the isolated
microorganism was expressing a cloned gene. All of these factors affect the 
cells' susceptibility to disruption. 

The chemical methods that disrupt microbial cell walls include
treatment with alkali, organic solvents, or detergents. If the protein product is
stable at pH values from about 10.5 to 12.5, then bacterial cell lysis can 
easily be carried out on a large scale at low cost. For example, recombinant 
human growth hormone is efficiently released from E. coli by treatment 
with sodium hydroxide at pH 11. Few, if any, viable cells remain after alkali 
treatment, which obviates concerns about the inadvertent release of a 
genetically engineered microorganism from a production facility. Treatment 
with an organic solvent is a simple and inexpensive way to disrupt cells 
and has been used for the isolation of enzymes from yeasts. However,
preliminary tests must be run to make sure that the proposed treatment does
not denature the target protein. Detergents permeabilize bacterial cells by 
solubilizing cell membranes and membrane proteins. As a consequence of
this activity, holes are formed, and proteins and other molecules are
released from the cells. Unfortunately, detergents are expensive, frequently 
denature the protein product, and are often retained as contaminants 
throughout the purification process. 

The major biological method for disrupting microbial cells is
enzymatic lysis. For example, the cell walls of gram-positive bacteria are readily
hydrolyzed by the enzyme lysozyme, which is isolated from egg whites; 
the cell walls of gram-negative bacteria are hydrolyzed by lysozyme and 
the metal-chelating agent ethylenediaminetetraacetic acid (EDTA); and the 
cell walls of yeasts are hydrolyzed by combinations of one or more of the 
following enzymes: β-1,3-glucanase, β-1,6-glucanase, mannanase, and
chitinase. Enzymatic treatments are highly specific, and the conditions for
lysis are mild. Currently, cost considerations limit the use of enzymes as 
cell lysis agents. However, the use of genetically engineered
microorganisms for large-scale production of the enzymes that attack cell walls should
make them less expensive. 
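
The enzyme choices described above amount to a small lookup table. The sketch
below is only an illustration in Python; the dictionary and function names are
hypothetical, and the entries simply restate the agents named in the text.

```python
from typing import Dict, List

# Agents named in the text for enzymatic lysis of each cell type.
# For yeasts the text says "combinations of one or more" of the listed
# enzymes, so that list is a menu of candidates, not a fixed recipe.
ENZYMATIC_LYSIS_AGENTS: Dict[str, List[str]] = {
    "gram-positive bacteria": ["lysozyme"],
    "gram-negative bacteria": ["lysozyme", "EDTA"],
    "yeast": ["beta-1,3-glucanase", "beta-1,6-glucanase", "mannanase", "chitinase"],
}


def lysis_agents(cell_type: str) -> List[str]:
    """Return the candidate lysis agents for a cell type (hypothetical helper)."""
    key = cell_type.strip().lower()
    if key not in ENZYMATIC_LYSIS_AGENTS:
        raise KeyError(f"no enzymatic lysis agents recorded for {cell_type!r}")
    # Return a copy so callers cannot modify the table by accident.
    return list(ENZYMATIC_LYSIS_AGENTS[key])


if __name__ == "__main__":
    for organism in ENZYMATIC_LYSIS_AGENTS:
        print(organism, "->", ", ".join(lysis_agents(organism)))
```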

Microbial cells can be physically disrupted either by nonmechanical 
methods, which include osmotic shock and repeated cycles of freezing and 
thawing, or by mechanical procedures, such as sonication, wet milling, 
high-pressure homogenization, and impingement. Generally, after
treatment by a nonmechanical method, many of the cells remain intact. In
contrast, mechanical disruption is highly efficient, which makes it the preferred
choice. Sonication, in which high-pressure sound waves disrupt cells by
shear and cavitation (production of internal holes), is generally useful
for small volumes.

Wet milling is quite commonly used for disrupting large quantities of
cells (Fig. 17.15A). A concentrated cell suspension is pumped into the
chamber of a high-speed agitator bead mill that is filled with an inert
abrasive material, such as small glass beads (<1 mm in diameter), and is fitted
with a central shaft that has a number of attached blades. When the device
is turned on and the blades are put in motion, most of the cell disruption
occurs as a consequence of the shear forces generated by the high-speed
motion of the glass beads. Optimized cell disruption depends on the
number and configuration of the agitator disks, the agitator speed, the
size of the glass beads, the number of glass beads, the cell concentration,
the geometry of the grinding chamber, and the temperature. Bead mills
have been successfully used to disrupt a wide range of different kinds of
microbial cell types and can readily break recombinant, as well as
nonrecombinant, cells.

FIGURE 17.15 Schematic representation of three methods of mechanical
disruption of microbial cells. (A) Wet milling; (B) high-pressure homogenization;
(C) impingement.

FIGURE 17.16 Schematic representation of large-scale ultrafiltration dialysis of a
protein solution. The solution from the feed tank is pumped across the ultrafiltration
membrane, with only a small fraction of the solution actually passing through the
membrane and the rest being used to sweep the membrane clean of protein. The
volume that passes through the membrane is matched by an equal volume of buffer
added to the system. A membrane with a nominal molecular mass cutoff below the
size of the target protein is used. For example, a 10-kDa cutoff membrane might be
used to retain a 30-kDa protein while removing salt from the solution. Following
dialysis, the target protein is found, in a dilute solution, in the feed tank. This
solution may be concentrated by ultrafiltration using an identical setup except that
no buffer tank is used. The arrows indicate the direction of liquid flow. The relative
volume of permeate (liquid that passes through the membrane) compared with the
retentate (retained liquid) is controlled by adjusting the permeate and retentate
valves; this helps to keep the membrane relatively free of protein that might
otherwise clog its pores.
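
The constant-volume dialysis shown in Fig. 17.16 lends itself to a short
calculation. The sketch below assumes an idealized case that the text does not
state: the salt passes the membrane freely, the protein is completely retained,
the feed tank is well mixed, and the added buffer contains no salt. Under those
assumptions, the fraction of salt remaining falls as exp(-N), where N is the
permeate volume divided by the constant system volume. The function names and
the example volumes are hypothetical.

```python
import math


def salt_fraction_remaining(permeate_volume_l: float, system_volume_l: float) -> float:
    """Fraction of the original salt left after constant-volume dialysis.

    Assumes ideal behavior: salt passes the membrane freely, the tank is
    well mixed, and each liter of permeate is replaced by salt-free buffer.
    """
    if system_volume_l <= 0 or permeate_volume_l < 0:
        raise ValueError("system volume must be > 0 and permeate volume >= 0")
    exchanges = permeate_volume_l / system_volume_l  # number of volume exchanges
    return math.exp(-exchanges)


def buffer_needed_l(target_fraction: float, system_volume_l: float) -> float:
    """Buffer volume required to bring the salt down to `target_fraction`."""
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    if system_volume_l <= 0:
        raise ValueError("system volume must be > 0")
    return -math.log(target_fraction) * system_volume_l


if __name__ == "__main__":
    # Hypothetical example: 100 liters of protein solution, 99% salt removal.
    volume = buffer_needed_l(0.01, 100.0)
    print(f"buffer needed: {volume:.0f} liters")
    print(f"salt remaining: {salt_fraction_remaining(volume, 100.0):.3f}")
```

Under these assumptions, removing 99% of the salt takes about 4.6 volume
exchanges, since ln(100) is approximately 4.6.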

In the high-pressure homogenization process (Fig. 17.15B),
concentrated cells are pumped into a valve assembly under high pressure, and
the pressure is then rapidly decreased, causing the cells to lyse. This
process can be customized for different microorganisms and protein products
by changing the operating pressure, the design of the valve, the
temperature of the cell suspension, or the number of times the cell mass is
treated.
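
The effect of the number of passes can be illustrated with a simple model. The
sketch below assumes, as a simplification not taken from the text, that every
pass through the homogenizer lyses the same fraction of the cells that are still
intact; in practice that fraction depends on the operating pressure, the valve
design, the temperature, and the organism. The function names and the example
numbers are hypothetical.

```python
def fraction_disrupted(per_pass_fraction: float, passes: int) -> float:
    """Cumulative fraction of cells lysed after a number of passes.

    Assumes each pass lyses the same fraction of the remaining intact cells.
    """
    if not 0.0 <= per_pass_fraction <= 1.0:
        raise ValueError("per_pass_fraction must be between 0 and 1")
    if passes < 0:
        raise ValueError("passes must be non-negative")
    return 1.0 - (1.0 - per_pass_fraction) ** passes


def passes_required(per_pass_fraction: float, target_fraction: float) -> int:
    """Smallest whole number of passes that reaches `target_fraction`."""
    if not 0.0 < per_pass_fraction < 1.0:
        raise ValueError("per_pass_fraction must be strictly between 0 and 1")
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    passes = 0
    intact = 1.0  # fraction of cells still unbroken
    while 1.0 - intact < target_fraction:
        intact *= 1.0 - per_pass_fraction
        passes += 1
    return passes


if __name__ == "__main__":
    # Hypothetical figures: 60% of intact cells lysed per pass, 95% target.
    for n in range(1, 5):
        print(n, "passes:", f"{fraction_disrupted(0.6, n):.1%}", "disrupted")
    print("passes needed for 95%:", passes_required(0.6, 0.95))
```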

Impingement (Fig. 17.15C) is a cell disruption procedure in which a 
high-velocity stream of suspended cells under pressure hits either a
stationary surface or a second fluid stream of suspended cells. The forces that
are created at the point of contact disrupt the cells. With a device called a
Microfluidizer, for example, two parallel streams of E. coli cells in
suspension are directed toward one another. With this device, a high percentage
of the cells are disrupted by a single passage through the unit. However, 
additional passages may be required for complete breakage of other cell 
types. Unlike high-pressure homogenizers and high-speed agitator bead 
mills, which generally require highly concentrated cell suspensions, this 
device can be used to disrupt cells in either dilute or concentrated
preparations. The activities of cellular proteins are not significantly impaired by the
technique. When the cell suspension is pretreated with low levels of 
lysozyme and then disrupted in the Microfluidizer at much lower than 
normal pressure and fluid velocity, the activities of labile proteins, which 
might be otherwise inactivated by the high pressure, are retained.


Downstream Processing

After cell disruption, cell debris is removed by either low-speed,
high-capacity centrifugation or membrane microfiltration. The protein product
is precipitated from the crude lysate, the clarified lysate, or the cell-free
culture medium with organic solvents (alcohol or acetone) or ammonium
sulfate. Under these conditions, the target protein is usually enriched
approximately two- to fivefold. Unfortunately, the cost of the precipitant
can add significantly to the cost of the process. Alternatively, the crude
protein mixtures are concentrated and fractionated by cross-flow
ultrafiltration through membranes that have a smaller average pore size than
those used for either cell concentration or debris removal (Fig. 17.14B). This
approach can be used with volumes ranging from 1 liter to several
thousand liters and can be performed continuously, which means that
large-volume systems are not required. Depending on the size and properties of
the target protein, this method can yield 10- to 100-fold enrichment.
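
Fold enrichment can be turned into an approximate purity figure. The sketch
below treats n-fold enrichment as an n-fold increase in the target's share of
total protein, capped at 100%. The starting purity of 0.5% is a hypothetical
value chosen only for illustration, and the function name is likewise invented.

```python
def purity_after_enrichment(initial_purity: float, fold_enrichment: float) -> float:
    """Target protein as a fraction of total protein after an enrichment step.

    `initial_purity` is the starting fraction (0 to 1); `fold_enrichment`
    is the factor by which that fraction increases, capped at 1.0.
    """
    if not 0.0 < initial_purity <= 1.0:
        raise ValueError("initial_purity must be in (0, 1]")
    if fold_enrichment < 1.0:
        raise ValueError("fold_enrichment must be at least 1")
    return min(1.0, initial_purity * fold_enrichment)


if __name__ == "__main__":
    start = 0.005  # hypothetical: target is 0.5% of total protein
    # Enrichment ranges quoted in the text for the two alternatives.
    ranges = {"precipitation": (2, 5), "cross-flow ultrafiltration": (10, 100)}
    for method, (low, high) in ranges.items():
        low_purity = purity_after_enrichment(start, low)
        high_purity = purity_after_enrichment(start, high)
        print(f"{method}: {low_purity:.1%} to {high_purity:.1%}")
```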

On a large scale, it is impractical to remove small molecules, such as
salts or organic solvents, from protein solutions by conventional laboratory
procedures. Consequently, the same apparatus that is used to concentrate
proteins by ultrafiltration has been developed for the large-scale dialysis of
proteins (Fig. 17.16). In addition, by using two different-size membranes,
large volumes of target proteins with different molecular masses may be
fractionated by ultrafiltration (Fig. 17.17).

FIGURE 17.17 Schematic representation of large-scale protein purification using
ultrafiltration membranes. In the example shown, two membranes are used
sequentially, a 100-kDa cutoff membrane and a 10-kDa cutoff membrane, in order to
purify a 30-kDa protein. The solution, which contains a protein mixture, is pumped
through the 100-kDa cutoff membrane, with the larger proteins being retained and
the smaller proteins, including the 30-kDa target protein, passing through the
membrane. This protein solution is next pumped through a 10-kDa cutoff membrane,
with the 30-kDa target protein being retained. The arrows indicate the direction of
the liquid flow. The symbols are identical to those found in Fig. 17.16.
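
The two-membrane scheme of Fig. 17.17 can be sketched as a simple sorting rule.
The code below assumes ideal membranes with sharp cutoffs, which real membranes
with nominal cutoffs only approximate. The function name, the fraction labels,
and the example protein masses other than the 30-kDa target are hypothetical.

```python
from typing import Dict, List


def fractionate(masses_kda: Dict[str, float],
                upper_cutoff_kda: float = 100.0,
                lower_cutoff_kda: float = 10.0) -> Dict[str, List[str]]:
    """Sort proteins into fractions for two membranes used in sequence.

    Idealized rule: a membrane retains every protein whose molecular mass
    is at or above its cutoff and passes everything smaller.
    """
    if lower_cutoff_kda >= upper_cutoff_kda:
        raise ValueError("lower cutoff must be smaller than upper cutoff")
    fractions: Dict[str, List[str]] = {
        "retained_by_first": [],   # held back by the larger-cutoff membrane
        "retained_by_second": [],  # passes the first, held back by the second
        "final_permeate": [],      # passes both membranes
    }
    for name, mass in masses_kda.items():
        if mass >= upper_cutoff_kda:
            fractions["retained_by_first"].append(name)
        elif mass >= lower_cutoff_kda:
            fractions["retained_by_second"].append(name)
        else:
            fractions["final_permeate"].append(name)
    return fractions


if __name__ == "__main__":
    mixture = {"large contaminant": 150.0, "target": 30.0, "small contaminant": 5.0}
    for fraction, names in fractionate(mixture).items():
        print(fraction, "->", names)
```

With the 100-kDa and 10-kDa cutoffs of the figure, the 30-kDa target ends up in
the fraction retained by the second membrane.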

The required degree of purity of the final protein product depends on 
its end use. In some cases, such as enzymes for use with laundry detergent, 
crude preparations are satisfactory, but for other products, such as
pharmaceutical proteins, additional purification procedures are required.

A number of proteins that are overproduced intracellularly are
confined to insoluble particles (inclusion bodies) within the bacterium. After
disruption, such inclusion bodies can be readily separated from the bulk of
the remaining cell components. Initially, researchers found it difficult to
solubilize inclusion bodies without irreversibly denaturing the protein, but
protocols have now been devised to renature the proteins found in
inclusion bodies. Of course, these additional steps increase the cost of the
purification process.

Protein Solubilization 

In some instances, overexpression of a target protein can result in the
production of both soluble and insoluble forms, which complicates the
purification process. For example, when human insulin-like growth factor I
(IGF-I), a 7.6-kilodalton (kDa) peptide, was expressed in E. coli cells,
approximately 90% of the recombinant protein was localized in the E. coli
periplasm (soluble and insoluble) and about 10% was found in the external
medium (soluble). To recover both forms of IGF-I from the periplasm, high
concentrations of urea and dithiothreitol at alkaline pH were added to
solubilize the insoluble forms of the peptide in situ. This treatment kills but
does not lyse the cells. Consequently, the cytoplasmic proteins remain
within the cells. The solubilization procedure produces a highly viscous
solution, which precludes removing the cells and cell debris by
centrifugation. Instead, an aqueous two-phase liquid extraction procedure that
separates soluble and insoluble materials was developed for this purpose. Both
the in situ solubilization and the aqueous two-phase liquid extraction
procedures are highly efficient; 80 to 95% of the IGF-I is recovered from 10-,
100-, or 1,000-liter fermentations by these methods.

FIGURE 17.18 Schematic representation of a process for the continuous thermal lysis
of E. coli cells for the large-scale preparation of purified plasmid DNA.


Large-Scale Production of Plasmid DNA 

Gene therapy and genetic immunization protocols are being tested in a 
large number of clinical trials. Many of these trials require plasmids as
vectors to deliver the remedial DNA to the patient. As these procedures
become more routine and are extended to many more patients, it will be 
necessary to produce plasmid DNA, initially in the 5- to 20-kilobase size 
range, on a large scale and in a highly purified form, i.e., pharmaceutical 
grade. 

A number of factors, including the choice of a host strain, such as E. coli 
K-12, that is safe and well characterized; growth conditions; defined 
medium; and the purification process that removes all of the undesired 
genomic DNA, RNA, proteins, lipids, and lipopolysaccharides, need to be 
considered for the large-scale production of plasmid DNA. Generally, to 
ensure that the plasmid DNA will be stable, a host strain should produce 
only low levels of nucleases. In addition, the presence of antibiotics in the 
culture medium should be avoided, since it is difficult to remove all traces 
of an antibiotic from the final preparation. For DNA isolation, the current 
method of choice for cell breakage is the alkaline-lysis procedure
(treatment with 0.2 M NaOH and 1% sodium dodecyl sulfate), which breaks
cells without disrupting plasmid DNA. The lysis solution must be added
so that mixing is sufficient to lyse all of the cells and local pH extremes are
avoided while the shear that is generated by the mixing does not damage 
the plasmid DNA. After cell lysis, the precipitates formed, which contain 
cell debris, denatured proteins, and nucleic acids, must be removed. This is 
most commonly done by centrifugation using fixed-angle rotors, a process 
which is difficult to scale up. After removal of the precipitated material, 
plasmid purification is best achieved by anion-exchange chromatography, 
followed by size exclusion (gel filtration) chromatography. In the industrial 
production of many recombinant proteins, high-pressure homogenizers are 
used to continuously disrupt cells. However, the use of such homogenizers
typically results in the degradation of a large portion of the plasmid DNA 
as a consequence of the severe shear stress generated during cell breakage. 
To work out a process for the large-scale preparation of plasmid DNA, one 
group of researchers developed and optimized a unique continuous 
thermal-lysis protocol. In this procedure, cells were harvested after
high-cell-density batch culture, filtered to remove most of the growth medium,
and resuspended in lysis buffer containing lysozyme. After incubation at 
37°C and gentle stirring for 20 minutes, the cells were heated to 70°C for 20 
seconds and then filtered to remove the cell debris (Fig. 17.18). Using this 
procedure, researchers have reported obtaining 100 mg of high-quality
plasmid DNA, free of contaminating chromosomal DNA, per liter of
high-cell-density cell culture. Lysozyme is the most expensive
component of this process. Nevertheless, as a consequence of the high plasmid
yield, the avoidance of time-consuming and expensive centrifugation
steps, the short processing time (17 liters of E. coli cells can be processed in
~45 minutes), and the ease with which the process can be scaled up, this
approach is likely to provide a highly effective means of preparing large
amounts of highly purified plasmid DNA.
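
The alkaline-lysis solution mentioned above (0.2 M NaOH and 1% sodium dodecyl
sulfate) can be scaled with simple arithmetic. The sketch below assumes that the
1% figure is weight per volume, which the text does not specify, and uses a
molar mass of 39.997 g/mol for NaOH. The function name and the 10-liter example
are hypothetical.

```python
from typing import Dict

NAOH_MOLAR_MASS_G_PER_MOL = 39.997  # sodium hydroxide


def lysis_solution_amounts(volume_l: float,
                           naoh_molar: float = 0.2,
                           sds_percent_w_v: float = 1.0) -> Dict[str, float]:
    """Grams of NaOH and SDS needed for a given volume of lysis solution.

    Assumes the SDS percentage is weight per volume (an assumption, since
    the text only says "1%"), so 1% corresponds to 10 grams per liter.
    """
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    naoh_g = naoh_molar * NAOH_MOLAR_MASS_G_PER_MOL * volume_l
    sds_g = sds_percent_w_v * 10.0 * volume_l
    return {"NaOH_g": naoh_g, "SDS_g": sds_g}


if __name__ == "__main__":
    amounts = lysis_solution_amounts(10.0)  # hypothetical 10-liter batch
    print(f"NaOH: {amounts['NaOH_g']:.1f} g, SDS: {amounts['SDS_g']:.1f} g")
```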




SUMMARY 


The large-scale production of genetically engineered
organisms in industrial-size bioreactors (>1,000 liters) is not
achieved merely by extrapolating directly from laboratory 
growth conditions (0.1 to 1.0 liter). The temperature, pH, rate 
and nature of mixing, oxygen demand for aerobic organisms, 
and nutrient levels must be taken into consideration when 
large bioreactors are being designed. 

Microbial fermentations can be performed in several
different ways. In batch fermentation, an inoculum of cells is
added to fresh medium, and the fermentation is allowed to
proceed without supplementation until the maximum amount
of the target product is synthesized. Under these conditions,
the cell culture passes through six phases of growth: lag,
acceleration, log, deceleration, stationary, and death. Protein
production is optimal during the log phase, whereas peak
production of many low-molecular-weight products occurs
during the stationary phase. It is important to monitor batch
fermentations closely to ensure that the cells are harvested at
the appropriate time. In fed-batch fermentation, growth
medium is added at various intervals, usually to prolong the
log phase of the fermentation process. Continuous
fermentation entails adding fresh growth medium throughout the
course of the fermentation and simultaneously removing cells
and spent medium.

Each of these fermentation systems has disadvantages and
advantages for large-scale production of recombinant
products. Although continuous fermentation is still a relatively
untried industrial process, the approach has some inherent
benefits, making it likely that its use will become more
widespread in the future.

One way to increase the amount of a recombinant protein 
is to grow transformants to as high a cell density as possible. 
The best way to obtain high-density cell cultures is to use a 
fed-batch fermentation strategy. Fed-batch fermentation has
been directly compared with batch fermentation for the
production of several different proteins, and in all cases
examined, fed-batch fermentation has resulted in a higher yield of
the target protein. 


To obtain the largest amount of product in a given volume, 
it is helpful to attain high cell densities, to avoid the loss of the 
recombinant plasmid, to utilize E. coli cells that can become 
quiescent during foreign-gene expression, to secrete the
foreign protein into the growth medium, and to avoid the
formation of acetate in the medium.

There are three basic bioreactor configurations: STRs,
bubble columns, and airlift reactors. Currently, STRs are used
most frequently in industry, but interest in airlift bioreactors is
increasing. In STRs, mixing is achieved by mechanical
agitation. In airlift reactors, both aeration and mixing are
performed by a gas, usually air, that is introduced through a
sparger at the bottom of the vessel, with either internal baffles
or external loops causing the fluid to circulate within the
vessel. Bubble columns are similar to airlift reactors but lack
the design features that cause the culture medium to circulate
in the vessel. The problems of maintaining sterility, pH,
temperature, and other fermentation parameters are overcome in
different ways depending on the design of the bioreactor.
Two-stage fermentation processes with either tandem airlift
reactors or a single STR have been successfully used for the
production of recombinant proteins.

When the product is present within the cells, the cells can
be harvested by either centrifugation or cross-flow filtration
and lysed chemically, enzymatically, or mechanically. The
preferred forms of mechanical cell lysis include wet milling,
high-pressure homogenization, and impingement. The product is
then fractionated from the cell lysate. Ultrafiltration has been
found to be an effective method for large-scale dialysis,
concentration, and initial fractionation of proteins produced by
recombinant organisms.

Scientists have begun to establish procedures for the
large-scale isolation of plasmid DNA. These procedures must take
into account the host cell and its growth and metabolism, 
plasmid size, cell lysis methods, and the complete removal of 
a number of potentially contaminating cell components. 


REFERENCES 

Aristidou, A. A., K. Y. San, and G. N. 
Bennett. 1995. Metabolic engineering 
of Escherichia coli to enhance recombi¬ 
nant protein production through ace¬ 
tate reduction. Biotechnol. Prog. 
11:475-478. 

Bailey, J. E., and D. F. Olis. 1977. 

Biochemical Engineering Fundamentals. 
McGraw-Hill Book Co., New York, 
NY. 

Charles, M. 1985. Fermentation scale- 
up: problems and possibilities. Trends 
Biotechnol. 3:134-139. 


Choi, J.-I., and S. Y. Lee. 1999. High- 
level production of poly(3-hydroxybu- 
tyrate-co-3-hydroxyvalerate) by 
fed-batch culture of recombinant 
Escherichia coli. Appl. Environ. Microbiol. 
65:4363M368. 

Datar, R. 1986. Economics of primary 
separation steps in relation to fermen¬ 
tation and genetic engineering. Process 
Biochem. 21:19-26. 

Donovan, R. S., C. W. Robinson, and 
B. R. Glick. 2000. Optimizing the 
expression of a monoclonal antibody 


fragment under the transcriptional 
control of the Escherichia coli lac pro¬ 
moter. Can. J. Microbiol. 46:532-541. 

Eiteman, M. A., and E. Altman. 2006. 
Overcoming acetate in Escherichia coli 
recombinant protein fermentations. 
Trends Biotechnol. 24:530-536. 

Engler, C. R. 1985. Disruption of 
microbial cells, p. 305-324. In C. L. 
Cooney, A. E. Humphrey, and M. 
Moo-Young (ed.). Comprehensive 
Biotechnology , vol. 2. Pergamon Press, 
Oxford, United Kingdom. 




Large-Scale Production of Proteins from Recombinant Microorganisms 


721 


Giorgio, R. J v and J. J. Wu. 1986. 
Design of large scale containment 
facilities for recombinant DNA fer¬ 
mentations. Trends Biotechnol. 4:60-65. 

Gosset, G v R. de Anda, N. Cruz, A. 
Martinez, R. Quintero, and F. Bolivar. 

1993. Recombinant protein production 
in cultures of an Escherichia coli trp2 
strain. Appl. Microbiol. Biotechnol. 
39:541-546. 

Grund, G., C. W. Robinson, and B. R. 
Glick. 1991. Cross-flow ultrafiltration 
of proteins, p. 69-83. In M. D. White, 

S. Reuveny, and A. Shafferman (ed.), 
Biologicals from Recombinant 
Microorganisms and Animal Cells: 
Production and Recovery. Verlag 
Chemie, Weinheim, Germany. 

Hagg, P., J. Wa de Pohl, F. 
Abdulkarim, and L. A. Isakson. 2004. 
A host/plasmid system that is not 
dependent on antibiotics and antibi¬ 
otic resistance genes for stable plasmid 
maintenance in Escherichia coli. ]. 
Biotechnol. 111:17-30. 

Hart, R. A., R M. Lester, D. H. 
Reifsnyder, J. R. Ogez, and S. E. 
Builder. 1994. Large scale, in situ iso¬ 
lation of periplasmic IGF-I from E. coli. 
Bio/Technology 12:113-117. 

Khosla, C., J. E. Curtis, J. DeModena, 
U. Rinas, and J. E. Bailey. 1990. 
Expression of intracellular hemoglobin 
improves protein synthesis in oxygen- 
limited Escherichia coli. Bio/Technology 
8:849-853. 

Kroner, K. H. 1986. Cross-flow filtra¬ 
tion in the downstream processing of 
enzymes: current status. Biotechnol. 
Forum 3:20-31. 

Kroner, K. H., H. Nissinen, and H. 
Zeigler. 1987. Improved dynamic fil¬ 
tration of microbial suspensions. Bio/ 
Technology 5:921-926. 

Lee, S. Y. 1996. High cell-density cul¬ 
ture of Escherichia coli. Trends 
Biotechnol. 14:98-105. 

Levy, M. S., R. D. O'Kennedy, P. 
Ayazi-Shamlou, and P. Dunnill. 2000. 
Biochemical engineering approaches 
to the challenges of producing pure 
plasmid DNA. Trends Biotechnol. 
18:296-305. 

Lim, H.-K., and K.-H. Jung. 1998. 
Improvement of heterologous protein 
productivity by controlling postinduc¬ 
tion specific growth rate in recombi¬ 
nant Escherichia coli under control of 


the p L promoter. Biotechnol. Prog. 
14:548-553. 

Lim, H.-K., S.-G. Kim, K.-H. Jung and 
J.-H. Seo. 2004. Production of the 
kringle fragments of human 
apolipoprotein(a) by continuous lac¬ 
tose induction strategy. J. Biotechnol. 
108:271-278. 

March, J. C., M. A. Eiteman, and E. 
Altman. 2002. Expression of an ana- 
plerotic enzyme, pyruvate carboxy¬ 
lase, improves recombinant protein 
production in Escherichia coli. Appl. 
Environ. Microbiol. 68:5620-5624. 

McKillip, E. R., A. S. Giles, M. H. 
Levner, P. P. Hung, and R. N. Hjorth. 

1991. Bioreactors for large-scale t-PA 
production. Bio/Technology 9:805-812. 

Mendoza-Vega, O., C. Hebert, and S. 
W. Brown. 1994. Production of recom¬ 
binant hirudin by high cell density 
fed-batch cultivations of a 
Saccharomyces cerevisiae strain: physio¬ 
logical considerations during the bio¬ 
process design. /. Biotechnol. 

32:249-259. 

Merchuk, J. C. 1990. Why use airlift 
bioreactors? Trends Biotechnol. 8:66-71. 

Mukherjee, K. J., D. C. D. Rowe, 

N. A. Watkins, and D. K. Summers. 

2004. Studies of single-chain antibody 
expression in quiescent Escherichia coli. 
Appl. Environ. Microbiol. 70:3005-3012. 

Park, T. H., J.-H. Seo, and H. C. Lim. 

1991. Two-stage fermentation with 
bacteriophage A, as an expression 
vector in Escherichia coli. Biotechnol. 
Bioeng. 37:297-302. 

Paulson, D. J., R. L. Wilson, and D. D. 
Spatz. 1984. Cross-flow membrane 
technology and its applications. Food 
Technol. December 1984:77-87. 

Prazeres, D. M. F., G. N. M. Ferreira, 
G. A. Monteiro, C. L. Cooney, and J. 
M. S. Cabral. 1999. Large-scale pro¬ 
duction of pharmaceutical-grade 
plasmid DNA for gene therapy: prob¬ 
lems and bottlenecks. Trends Biotechnol. 
17:169-174. 

Ramirez, D. M., and W. E. Bentley. 

1995. Fed-batch feeding and induction 
policies that improve foreign protein 
synthesis and stability by avoiding 
stress response. Biotechnol. Bioeng. 
47:596-608. 

Reuss, M. 1995. Stirred tank bioreac¬ 
tors, p. 207-255. In J. A. Asenjo and J. 
Merchuk (ed.), Bioreactor System 


Design. Marcel Dekker, Inc., New York, 
NY. 

Riesenberg, D., and R. Guthke. 1999. 
High-cell-density cultivation of micro¬ 
organisms. Appl. Microbiol. Biotechnol. 
51:422-430. 

Robinson, D. K., C. P. Chan, C. Yu Ip, 
P. K. Tsai, J. Tung, T. C. Seamans, 

A. B. Lenny, D. K. Lee, J. Irwin, and 
M. Silberklang. 1994. Characterization 
of a recombinant antibody produced 
in the course of a high yield fed-batch 
process. Biotechnol. Bioeng. 44:727-735. 

Rowe, D. C. D., and D. K. Summers. 

1999. The quiescent-cell expression 
system for protein synthesis in 
Escherichia coli. Appl. Environ. Microbiol. 
65:2710-2715. 

Sauer, T., C. W. Robinson, and B. R. 
Glick. 1989. Disruption of native and 
recombinant Escherichia coli in a high- 
pressure homogenizer. Biotechnol. 
Bioeng. 33:1330-1342. 

Sayadi, S., M. Nasri, F. Berry, J. N. 
Barbotin, and D. Thomas. 1987. Effect 
of temperature on the stability of 
plasmid pTG201 and productivity of 
xylE gene product in recombinant 
Escherichia coli: development of a two- 
stage chemostat with free and immo¬ 
bilized cells. J. Gen. Microbiol. 
133:1901-1908. 

Schleef, M. 1999. Issues for large-scale 
DNA manufacturing, p. 443M70. In 
H. J. Rehm and G. Reed (ed.), 
Biotechnology: a Multi-Volume 
Comprehensive Treatise, vol. 5a. 
Recombmant Proteins, Monoclonal 
Antibodies, and Therapeutic Genes. 

Wiley-VCH, New York, NY. 

Schiigerl, K., and A. Liibbert. 1995. 
Pneumatically agitated bioreactors, p. 
257-303. In J. A. Asenjo and J. Merchuk 
(ed.), Bioreactor System Design. Marcel 
Dekker, Inc., New York, NY. 

Schiitte, H., and M.-R. Kula. 1990. 
Pilot- and process-scale techniques for 
cell disruption. Biotechnol. Appl. 
Biochem. 12:599-620. 

Seigel, R., and D. D. Y. Ryu. 1985. 
Kinetic study of instability of recombi¬ 
nant plasmid pPLc23trpAl in E. coli 
using two-stage continuous culture 
system. Biotechnol. Bioeng. 27:28-33. 

Siegel, M. H., H. Hallaille, and J. C. 
Merchuk. 1988. Air-lift reactors: 
design, operation, and applications. 
Adv. Biotechnol. Processes 7:79-124. 


722 


CHAPTER 17 


Sletta, H., A. Tondervik, S. Hakvag, 

T. E. Vee Aune, A. Nedal, R. Aune, G. 
Evensen, S. Valla, T. E. Ellingsen, and 
T. Brautaset. 2007. The presence of 
N-terminal secretion signals leads to 
strong stimulation of the total expres¬ 
sion levels of three tested medically 
important proteins during high-cell- 
density cultivations of Escherichia coli. 
Appl. Environ. Microbiol. 73:906-912. 

Strandberg, L., L. Andersson, and 
S.-O. Enfors. 1994. The use of fed 
batch cultivation for achieving high 
cell densities in the production of a 
recombinant protein in Escherichia coli. 
FEMS Microbiol. Rev. 14:53-56. 

Strandberg, L., K. Kohler, and S.-O. 
Enfors. 1991. Large-scale fermentation 
and purification of a recombinant pro¬ 
tein from Escherichia coli. Process 
Biochem. 26:225-234. 

Strathman, H. 1985. Membranes and 
membrane processes in biotechnology. 
Trends Biotechnol. 3:112-118. 


Tanny, G. B., D. Mirelman, and T. 
Pistole. 1980. Improved filtration tech¬ 
niques for concentrating and har¬ 
vesting bacteria. Appl. Environ. 
Microbiol. 40:269-273. 

Tutunjian, R. S. 1985. Scale-up consid¬ 
erations for membrane processes. Bio/ 
Technology 3:615-626. 

Van Brunt, J. 1985. Scale-up: the next 
hurdle. Bio/Technology 3:419M24. 

Van Brunt, J. 1986. Fermentation eco¬ 
nomics. Bio/Technology 4:395-401. 

Wang, Z. W., Y. Chen, and Y.-P. Chao. 

2006. Enhancement of recombinant 
protein production in Escherichia coli 
by coproduction of aspartase. /. 
Biotechnol. 124:403-411. 

White, M. D., B. R. Glick, and C. W. 
Robinson. 1995. Bacterial, yeast and 
fungal cultures: the effect of microor¬ 
ganism type and culture characteristics 
on bioreactor design and operation, p. 
47-87. In J. A. Asenjo and J. Merchuk 


(ed.), Bioreactor System Design. Marcel 
Dekker, Inc., New York, NY. 

Whitney, G. D., B. R. Glick, and C. W. 
Robinson. 1989. Induction of T4 DNA 
ligase in a recombinant strain of 
Escherichia coli. Biotechnol. Bioeng. 
33:991-998. 

Yamane, T. 1995. Bioreactor operation 
modes, p. 479-509. In J. A. Asenjo and 
J. Merchuk (ed.). Bioreactor System 
Design. Marcel Dekker, Inc., New York, 
NY. 

Zhu, K., H. Jin, Y. Ma, Z. Ren, C.
Xiao, Z. He, F. Zhang, Q. Zhu, and B.
Wang. 2005. A continuous thermal
lysis procedure for the large-scale
preparation of plasmid DNA. J.
Biotechnol. 118:257-264.


REVIEW QUESTIONS 


1. What are the differences between batch, fed-batch, and 
continuous fermentations? 

2. How has fed-batch fermentation been used to improve the
production of the insulin B peptide, IFN-γ, and Fab fragment?

3. What parameters must be monitored and controlled in an 
optimized fermentation process? 

4. What is the effect of a recombinant plasmid on the growth 
of microbial cells? 

5. How does the mixing of a growing microbial culture affect 
the transfer of oxygen from the growth medium to the cells? 

6. What are the advantages of a high cell density during a 
large-scale fermentation? What conditions lead to a high cell 
density during a large-scale fermentation? 

7. What strategies can be employed to prevent acetate inhibi¬ 
tion of the growth of recombinant E. coli strains? 

8. What are the relative advantages and disadvantages of 
using an STR or an airlift fermenter? 

9. Compare the growth and induction of a recombinant 
microbial culture using (1) two reactors in tandem and (2) a 
single reactor. 

10. What is downstream processing? 

11. What strategy would you use to purify a recombinant 
protein that is secreted into the growth medium? 


12. What are the advantages and disadvantages of large-scale 
mechanical lysis of cells compared to chemical lysis? 

13. How are cells mechanically disrupted using (1) wet 
milling, (2) high-pressure homogenization, and (3) impinge¬ 
ment? 

14. How are microbial cells concentrated after the fermenta¬ 
tion stage of a biotechnological process? What are the advan¬ 
tages and disadvantages of these procedures? 

15. How are large volumes of protein solutions partially puri¬ 
fied using ultrafiltration? 

16. How can the activity of an insoluble recombinant protein 
be recovered? 

17. What factors should be considered for the large-scale iso¬ 
lation of plasmid DNA? 

18. How can "quiescent E. coli cells" be engineered to produce 
large amounts of foreign protein? 

19. What strategy would you use to ensure that a plasmid 
encoding a target protein is not lost during the large-scale 
growth of a recombinant bacterium? 

20. What are some of the advantages of secreting a recombi¬ 
nant protein into the growth medium? 

In the past, high-yielding strains of many different crop plants and
farm animals have been successfully developed by selective
crossbreeding. This time-consuming process has been superseded in part by
the development of methods for the genetic engineering of higher
organisms. Now, genes that contribute to specific traits can be introduced into
plants and animals and then passed on from one generation to the next.
How these transgenic plants and animals are formed and what sorts of
traits are being manipulated are examined in part III.

A number of transgenic plants have been engineered to be able to over¬ 
come a variety of biotic and abiotic stresses, including insect predation, 
viral infection, herbicides, pathogenic fungi and bacteria, oxidative stress, 
salt stress, and drought. Transgenic plants that have a significantly 
improved nutritional content, produce fruit or vegetables with enhanced 
taste or appearance, or produce flowers with altered pigmentation have 
been created. In addition, some transgenic plants have been used to facili¬ 
tate environmental cleanup (phytoremediation) while others have been 
used as living bioreactors in the production of a range of foreign proteins, 
including many therapeutic agents. Transgenesis of animals has included 
studies of cattle, sheep, goats, birds, and fish, all based on the principles 
that have been established for creating transgenic mice. Some of these 
transgenic animals have been developed to produce therapeutic proteins in 
their milk or eggs. 

Considerable effort has gone into developing varieties of plants
that produce increased yields and have enhanced nutritional value. 
Although much of this endeavor has been directed toward the three 
major grains—corn (maize), wheat, and rice—successful breeding pro¬ 
grams for other food plants and horticultural species have also been estab¬ 
lished. Recombinant DNA technology, which has been used extensively 
with microbial systems, is also an important tool for the direct genetic 
manipulation of plants. There are a number of effective DNA delivery sys¬ 
tems and expression vectors that work with a range of plant cells. 
Furthermore, most plant cells are totipotent—meaning that an entire plant 
can be regenerated from a single plant cell—so fertile plants that carry an 
introduced gene(s) in all cells (i.e., transgenic plants) can be produced from 
genetically engineered cells. If the transgenic plant flowers and produces 
viable seed, the desired trait is passed on to successive generations. 

There are three major reasons for developing transgenic plants. First, 
the addition of a gene(s) often improves the agricultural, horticultural, or 
ornamental value of a crop plant. Second, transgenic plants can act as 
living bioreactors for the inexpensive production of economically impor¬ 
tant proteins or metabolites. Third, plant genetic transformation (transgen¬ 
esis) provides a powerful means for studying the actions of genes during 
development and other biological processes. 

Some of the genetically determined traits that can be introduced into 
plants by a single gene or, possibly, a small cluster of genes are insecticidal 
activity, protection against viral infection, resistance to herbicides, protec¬ 
tion against pathogenic fungi and bacteria, delay of senescence, tolerance 
of environmental stresses, altered flower pigmentation, improved nutri¬ 
tional quality of seed proteins, increased postharvest shelf life, and self¬ 
incompatibility. In addition, transgenic plants can be made to produce a 
variety of useful compounds, including therapeutic agents, polymers, and 
diagnostic tools, such as antibody fragments. Alternatively, they can be 
engineered to synthesize viral antigenic determinants and, after ingestion, 
can be used as edible vaccines. To date, over 150 different plant species
have been genetically transformed, including many crop and forest species,
in over 50 countries worldwide. Plant biotechnology is having an enormous
impact on plant-breeding programs because it significantly decreases
the 10 to 15 years that it takes to develop a new variety using traditional
plant-breeding techniques.

BOX 18.1

Is the Debate over Genetically Modified Foods Effectively Over?

Rice is the most important crop in the world. In 2004, China, the
world's largest producer of rice, announced a ramping up of efforts to
commercialize genetically modified rice. India, the second-largest
producer of rice, was quick to follow suit. With the two largest rice
producers in the world (with ~40% of the world's population) switching to
genetically modified rice, the rest of the world will eventually have little
choice but to accept genetically modified foods.

From 1996 to 2007, the global area devoted to transgenic crops
increased from 1.7 million to 114 million hectares (ha). In 2007, the major
producers of transgenic plants were the United States (57.7 million ha
planted), Argentina (19.1 million ha), Brazil (15.0 million ha), Canada (7.0
million ha), India (6.2 million ha), China (3.8 million ha), Paraguay (2.6
million ha), and South Africa (1.8 million ha). Major transgenic crops
worldwide include (in order of number of hectares planted) soybean,
corn, cotton, canola, rice, squash, papaya, alfalfa, wheat, and eggplant.
The major traits that have been introduced into plants include herbicide
tolerance and insect resistance.

In 2002, the Food and Agriculture Organization of the United Nations
endorsed the development and use of genetically modified crops. According
to researcher Florence Wambugu of Nairobi, Kenya, "The African
continent urgently needs agricultural biotechnology, including transgenic
crops, in order to improve food production. Famine provides critics with
an opportunity to promote an antibiotech message that only results in
millions of people, who urgently need food, starving to death." Wambugu
urged the public to recognize the difference in needs between Europe and
Africa. Europe, with a population that is under control, has surplus food and
does not experience hunger, whereas Africa, in contrast, experiences mass
starvation and death. By mid-2008, South Africa was the only country in
Africa to have approved the commercial use of genetically modified crops.

By mid-2008, researchers had reported the complete DNA sequences of 
hundreds of microorganisms and dozens of animals, but only three plants: 
Arabidopsis thaliana, rice, and poplar. At that time, the genome sequencing 
of several other plants, including corn, soybean, canola, tomato, cotton, 
potato, cassava, sorghum, grape, and peach, had been initiated. While the 
study of plant genes and genomes clearly lags behind studies of animals, it 
is gaining momentum, so that within the next 5 to 10 years, a wealth of 
information, with an enormous impact on plant biotechnology, is expected 
to become available. 

Despite all of the progress that has been made in the development of 
transgenic plants for a wide variety of purposes, a vocal minority of indi¬ 
viduals in North America and a larger number in Europe still oppose the 
use of this technology. Nevertheless, with each succeeding year since the 
mid-1990s, the use of transgenic crops has continued to increase both in 
absolute terms and in the number of countries using this technology (Box 
18.1). It is expected that in the not too distant future, the majority of agri¬ 
cultural crops worldwide will be transgenic. 


Plant Transformation with the Ti Plasmid of A. tumefaciens 

The gram-negative soil bacterium Agrobacterium tumefaciens is a phyto¬ 
pathogen that, as a normal part of its life cycle, genetically transforms plant 
cells. This genetic transformation leads to the formation of crown gall 
tumors, which interfere with the normal growth of an infected plant (Fig. 
18.1). This agronomically important disease affects only dicotyledonous
plants (dicots), including grapes, stone fruit trees (e.g., peaches), and roses. 

Crown gall formation is the consequence of the transfer, integration, 
and expression of genes of a specific segment of bacterial plasmid DNA— 
called the T-DNA (transferred DNA)—into the plant cell genome. The 
T-DNA is actually part of the "tumor-inducing" (Ti) plasmid that is carried 
by most strains of A. tumefaciens. Depending on the Ti plasmid, the length 
of the T-DNA region can vary from approximately 10 to 30 kilobase pairs 
(kb). Strains of A. tumefaciens that do not possess a Ti plasmid cannot 
induce crown gall tumors. 

The initial step in the infection process is the attachment of A.
tumefaciens to a plant cell at the site of an open wound, often at the base of the
stem, i.e., the crown, of the plant. After the initial attachment step, A.
tumefaciens produces a network of cellulose fibrils that bind the bacterium
tightly to the plant cell surface. Originally, it was thought that A.
tumefaciens infected wounded plants because the physical barrier of the cell wall
had been breached by injury, thereby facilitating entry of the bacterium.
However, it is now recognized that these bacteria respond to certain plant
phenolic compounds, such as acetosyringone and hydroxyacetosyringone
(Fig. 18.2), which are excreted by susceptible wounded plants. These
wound response compounds resemble some of the products of
phenylpropanoid metabolism, which is the major plant pathway for the synthesis of
plant secondary metabolites, such as lignins and flavonoids. These small
molecules (i.e., acetosyringone and hydroxyacetosyringone) act to induce the
virulence (vir) genes that are carried on the Ti plasmid (Fig. 18.3).

The vir genes are located on a 35-kb region of the Ti plasmid that lies 
outside of the T-DNA region. There are 25 vir genes arranged in seven 
operons on the plasmid. The products of the vir genes are essential for the 
transfer and integration of the T-DNA region into the genome of a plant 
cell. 

After a Ti plasmid-carrying cell of A. tumefaciens attaches to a host plant 
cell and the vir genes are induced, the T-DNA is transferred by a process 
that is similar to plasmid transfer from donor to recipient cells during bac¬ 
terial conjugation. In this model, the T-DNA is transferred as a linear, 
single-stranded molecule from the Ti plasmid, enters the plant cell, and 
eventually becomes integrated into the plant chromosomal DNA. 

The formation of the single-stranded form of T-DNA is initiated by 
strand-specific cutting, by an enzyme encoded by one of the vir genes, at 
both borders of the intact T-DNA region. The 5' end of the single-stranded 
T-DNA carries the right-border sequence, and the left-border sequence is at 
the 3' end. The integration of the T-DNA into the plant genome depends on 
specific sequences that are located at the right border of the T-DNA. This 
border contains a repeating unit that consists of 25 base pairs (bp) (Fig. 
18.4). Although the left border contains a similar 25-bp repeat (Fig. 18.4), 
deletion studies have shown that this region is not involved in the integra¬ 
tion process. 
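
As a rough illustration of how the conserved 25-bp repeats delimit the T-DNA, the following Python sketch converts the consensus border sequences of Fig. 18.4 into regular expressions and extracts the DNA lying between a left and a right border. The function names, the example sequence, and the assumption of a linear sequence with both borders in the same orientation are illustrative only; they are not taken from any published tool.

```python
import re

# Consensus 25-bp border repeats from Fig. 18.4 (N = any nucleotide).
RIGHT_BORDER = "TGNCAGGATATATNNNNNNGTNANN"
LEFT_BORDER = "TGGCAGGATATATNNNNNTGTAAAN"


def consensus_to_regex(consensus):
    """Turn a consensus string into a compiled regex; N matches any base."""
    return re.compile(consensus.replace("N", "[ACGT]"))


def find_t_dna(plasmid):
    """Return the sequence between a left and a right border, or None.

    Hypothetical helper. The two consensus patterns are similar enough that
    a left-border repeat also satisfies the right-border pattern, so the
    right border is searched for only downstream of the left-border match.
    """
    left = consensus_to_regex(LEFT_BORDER).search(plasmid)
    if left is None:
        return None
    right = consensus_to_regex(RIGHT_BORDER).search(plasmid, left.end())
    if right is None:
        return None
    return plasmid[left.end():right.start()]


# Made-up example: flanking DNA, a left border, a short insert, a right border.
example_left = "TGGCAGGATATATACGTATGTAAAC"
example_right = "TGACAGGATATATTGGCGGGTAAAC"
example_plasmid = "CCCC" + example_left + "ATGAAATTT" + example_right + "GGGG"
print(find_t_dna(example_plasmid))
```

A real analysis would also have to handle circular plasmids, both strands, and borders that deviate from the consensus; this sketch ignores all three.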

During the insertion of the T-DNA into the plant chromosomal DNA, 
short deletions of the plant DNA are often produced at the junction 
between the T-DNA and the plant chromosomal DNA. In addition, while 
the insertion of the T-DNA into the plant DNA occurs at random sites, the 
T-DNA borders exhibit some homology with the plant DNA at the site of 
insertion. 

FIGURE 18.1 Infection of a plant with A. tumefaciens and formation of a crown
gall.

FIGURE 18.2 Structures of the plant molecules acetosyringone and
hydroxyacetosyringone. These compounds are released in response to wounding
and can induce the vir genes of the Ti plasmid.

FIGURE 18.3 Schematic representation of a Ti plasmid. The T-DNA is defined by its
left and right borders and includes genes for the biosynthesis of auxin, cytokinin,
and an opine; these genes are transcribed and translated only in plant cells. Outside
of the T-DNA region, there is a cluster of vir genes, a gene(s) that encodes an
enzyme(s) for opine catabolism, and an origin of DNA replication (ori) that permits
the plasmid to be stably maintained in A. tumefaciens. None of these features is
drawn to scale.


Most of the genes that are located within the T-DNA region are acti¬ 
vated only after the T-DNA is inserted into the plant genome. This reflects 
the fact that these are essentially plant genes, which cannot be expressed in 
bacteria because of the differences in transcriptional and translational regu¬ 
latory sequences between the two types of organisms. The products of 
these genes are responsible for crown gall formation. The T-DNA region 
includes the genes iaaM and iaaH. This pair of genes encodes enzymes that 
synthesize the plant hormone auxin (indoleacetic acid). Specifically, iaaM 
codes for the enzyme tryptophan 2-monooxygenase, which converts tryp¬ 
tophan to indole 3-acetamide, and iaaH encodes indole 3-acetamide hydro¬ 
lase, which converts indole 3-acetamide to indoleacetic acid (Fig. 18.5A). In 
addition, the T-DNA region carries the tmr gene (also known as ipt), which 
encodes isopentenyltransferase. This enzyme adds an isopentenyl side 
chain to 5'-AMP to form isopentenyladenosine 5'-phosphate, the first com¬ 
mitted step in the synthesis of the cytokinin isopentenyladenine (Fig. 
18.5B). Hydroxylation of these two molecules by plant enzymes generates 
the cytokinins called trans-zeatin and trans-ribosylzeatin, respectively. Both
auxin and the cytokinins regulate plant cell growth and development. In 
excess, they can cause the plant to develop tumorous growths, such as 
crown galls. 
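
The reactions described above can be summarized as a small lookup table. The Python sketch below records each T-DNA gene with its enzyme, substrate, and product as stated in the text, and then follows substrate-to-product links to list the steps leading from a starting compound. The data layout and the function name are assumptions made for illustration.

```python
# Reactions catalyzed by enzymes encoded in the T-DNA, as described in the text.
T_DNA_REACTIONS = [
    {"gene": "iaaM", "enzyme": "tryptophan 2-monooxygenase",
     "substrate": "tryptophan", "product": "indole 3-acetamide"},
    {"gene": "iaaH", "enzyme": "indole 3-acetamide hydrolase",
     "substrate": "indole 3-acetamide", "product": "indoleacetic acid"},
    {"gene": "tmr (ipt)", "enzyme": "isopentenyltransferase",
     "substrate": "5'-AMP", "product": "isopentenyladenosine 5'-phosphate"},
]


def trace_pathway(start, reactions=T_DNA_REACTIONS):
    """Follow substrate -> product links from `start`.

    Returns a list of (gene, product) tuples in the order the steps occur.
    Hypothetical helper; assumes each substrate is used by one reaction only.
    """
    by_substrate = {r["substrate"]: r for r in reactions}
    steps = []
    current = start
    while current in by_substrate:
        reaction = by_substrate[current]
        steps.append((reaction["gene"], reaction["product"]))
        current = reaction["product"]
    return steps


for gene, product in trace_pathway("tryptophan"):
    print(gene, "->", product)
```

The later hydroxylation steps that yield the zeatin-type cytokinins are carried out by plant enzymes and are therefore left out of the table.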

In addition to auxin and cytokinin biosynthesis genes, the T-DNA
region from each specific Ti plasmid carries a gene for the synthesis of a
molecule called an opine. Opines are unique and unusual condensation
products of either an amino acid and a keto acid or an amino acid and a
sugar. For example, the condensation product of arginine and pyruvic acid
is called octopine, arginine with α-ketoglutaraldehyde is nopaline, and
agropine is a bicyclic sugar derivative of glutamic acid (Fig. 18.6). The
opines are synthesized within the crown gall and then secreted. They can
be used as a carbon source, and sometimes also as a nitrogen source, by any
A. tumefaciens cell that carries a Ti plasmid-borne gene for the catabolism of
that particular opine (Fig. 18.3). The opine catabolism gene(s) is on the Ti
plasmid and is not part of the T-DNA region. All other soil microorganisms
that have been tested are incapable of utilizing opines as a carbon source.
Thus, a unique set of mechanisms has evolved whereby each strain of A.
tumefaciens genetically engineers plant cells to be biological factories for the
production of a carbon compound that it alone is able to use.

FIGURE 18.4 Conserved bases on the right and left borders of the T-DNA of Ti
plasmids. N indicates any one of the four nucleotides, i.e., there is no sequence
conservation at these positions.

Right  5'-TGNCAGGATATATNNNNNNGTNANN-3'
Left   5'-TGGCAGGATATATNNNNNTGTAAAN-3'

FIGURE 18.5 Biosynthesis of auxin and cytokinin by the enzymes encoded by the
T-DNA genes of the Ti plasmid of A. tumefaciens. (A) The auxin pathway involves
the conversion of tryptophan to indole 3-acetamide by tryptophan 2-monooxygenase
and then indole 3-acetamide to indoleacetic acid by indole 3-acetamide hydrolase.
(B) The cytokinin synthesis reaction entails the attachment of an isopentenyl moiety
from isopentenyl diphosphate (IPP) to 5'-AMP by the enzyme
isopentenyltransferase to form isopentenyl adenosine monophosphate (IPA).

FIGURE 18.6 Chemical structures of three opines: octopine, nopaline, and agropine.


Ti Plasmid-Derived Vector Systems 

The simplest way to exploit the ability of the Ti plasmid to genetically 
transform plants would be to insert a desired DNA sequence into the 
T-DNA region and then use the Ti plasmid and A. tumefaciens to deliver and 
insert this gene(s) into the genome of a susceptible plant cell. However, 
although the Ti plasmids are effective as natural vectors, they have several 
serious limitations as routine cloning vectors. 

• The production of phytohormones by transformed cells growing in 
culture prevents them from being regenerated into mature plants. 
Therefore, the auxin and cytokinin genes must be removed from any
Ti plasmid-derived cloning vector.

• A gene encoding opine synthesis is not useful to a transgenic plant
and may lower the final plant yield by diverting plant resources into 
opine production. Therefore, the opine synthesis gene should be 
removed. 

• Ti plasmids are large (approximately 200 to 800 kb). For recombinant 
DNA experiments, however, a much smaller version is preferred, so 
large segments of DNA that are not essential for a cloning vector 
must be removed. 

• Because the Ti plasmid does not replicate in Escherichia coli, the con¬ 
venience of perpetuating and manipulating Ti plasmids carrying 
inserted DNA sequences in that bacterium is not available. 

• Transfer of the T-DNA, which begins from the right border, does not 
always end at the left border. Rather, vector DNA sequences past the 
left border are often transferred, although the transfer of these 
sequences is not often tested for. 

To overcome these constraints, recombinant DNA technology was used 
to create a number of Ti plasmid-based vectors. These vectors are similarly 
organized and contain the following components. 

• A selectable marker gene, such as neomycin phosphotransferase, 
that confers kanamycin resistance on transformed plant cells. 
Because the neomycin phosphotransferase gene, as well as many of 
the other marker genes used in plant transformation, is prokaryotic 
in origin, it is necessary to put it under the control of plant (eukary¬ 
otic) transcriptional regulation signals, including both a promoter 
and a termination-polyadenylation sequence, to ensure that it is 
efficiently expressed in transformed plant cells. 

• An origin of DNA replication that allows the plasmid to replicate in
E. coli. In some vectors, an origin of replication that functions in A.
tumefaciens has also been added.

• The right border sequence of the T-DNA region. This region is abso¬ 
lutely required for T-DNA integration into plant cell DNA, although 
most cloning vectors include both a right and a left border 
sequence. 

• A polylinker (multiple cloning site) to facilitate insertion of the 
cloned gene into the region between T-DNA border sequences. 

• A "killer" gene encoding a toxin downstream from the left border to 
prevent unwanted vector DNA past the left border from being incor¬ 
porated into transgenic plants. If this incorporation occurs, and the 
killer gene is present, the transformed cells will not survive. 
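
The design rules in the two lists above can be expressed as a simple checklist. The Python sketch below flags components that are missing from a proposed vector, as well as natural Ti plasmid genes that should have been removed. The feature names are hypothetical labels chosen for this example, and the left border and the killer gene are treated as optional because the text describes only the right border as absolutely required.

```python
# Components that the text lists for Ti plasmid-derived cloning vectors.
REQUIRED_FEATURES = {
    "plant_selectable_marker",  # e.g., neomycin phosphotransferase with plant signals
    "ecoli_ori",                # origin of replication that functions in E. coli
    "right_border",             # needed for T-DNA integration
    "polylinker",               # multiple cloning site between the borders
}

# Natural T-DNA genes that must not remain in a cloning vector.
UNWANTED_GENES = {"iaaM", "iaaH", "tmr", "opine_synthesis"}


def check_vector(features):
    """Return a list of problems with a vector design (empty list = acceptable).

    `features` is any iterable of feature labels; the labels are illustrative.
    """
    features = set(features)
    problems = [f"missing: {name}" for name in sorted(REQUIRED_FEATURES - features)]
    problems += [f"should be removed: {name}" for name in sorted(UNWANTED_GENES & features)]
    return problems


good_design = {"plant_selectable_marker", "ecoli_ori", "right_border",
               "polylinker", "left_border", "killer_gene"}
poor_design = {"ecoli_ori", "right_border", "iaaM"}
print(check_vector(good_design))
print(check_vector(poor_design))
```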

Because these cloning vectors lack vir genes, they cannot by themselves
effect the transfer and integration of the T-DNA region into recipient plant
cells. Two different approaches have been used to achieve these ends. In
one approach, a binary vector system is used (Fig. 18.7A). The binary
cloning vector contains either both E. coli and A. tumefaciens origins of DNA
replication, i.e., it is an E. coli-A. tumefaciens shuttle vector, or a single
broad-host-range origin of DNA replication. In either case, no vir genes are
present on a binary cloning vector. All the cloning steps are carried out in
E. coli before the vector is introduced into A. tumefaciens. The recipient A.
tumefaciens strain carries a modified (defective, or disarmed) Ti plasmid
that contains a complete set of vir genes but lacks portions, or all, of the
T-DNA region, so that this T-DNA cannot be transferred. With this system,
the defective Ti plasmid synthesizes the vir gene products that mobilize the
T-DNA region of the binary cloning vector. By providing the proteins
encoded by the vir genes, the defective Ti plasmid acts as a helper plasmid,
enabling the T-DNA from the binary cloning vector to be inserted into the
plant chromosomal DNA. Since transfer of the T-DNA is initiated from the
right border, the selectable marker, which will eventually be used to detect
the presence of the T-DNA inserted into the plant chromosomal DNA, is
usually placed next to the left border. If the selectable marker were adjacent
to the right border, transfer of only a small portion of the T-DNA would
yield plants that contained the selectable marker but not the gene of
interest. A few binary vectors have been designed to include two plant
selectable markers, one adjacent to the right border and the other adjacent
to the left border.

FIGURE 18.7 Two Ti plasmid-derived cloning vector systems. (A) The binary cloning
vector has origins of DNA replication (ori) for both E. coli and A. tumefaciens (or a
broad-host-range origin), a selectable marker gene that can be used in either E. coli
or A. tumefaciens, and both a target gene and a plant selectable marker gene inserted
between the T-DNA left and right borders. (B) The cointegrate cloning vector (top)
carries only an E. coli origin of replication and cannot exist autonomously within
A. tumefaciens. It also contains a selectable marker that can be used in either E. coli
or A. tumefaciens, a T-DNA right border, a plant selectable marker (reporter) gene,
a target gene, and a sequence of Ti plasmid DNA that is homologous to a segment
on the disarmed Ti plasmid. The disarmed Ti plasmid (middle) contains the T-DNA
left border, the vir gene cluster, and an A. tumefaciens ori. Following recombination
between the cointegrate cloning vector and the disarmed Ti plasmid, the final
recombinant plasmid (bottom) has the T-DNA left and right borders bracketing the
cloned and plant reporter genes.
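
The reasoning behind placing the selectable marker next to the left border can be checked with a toy model. In the Python sketch below, the T-DNA is written as an ordered list of elements starting at the right border, and transfer is assumed to begin at the right border and to stop after any number of elements. The function counts the stopping points that would give a plant carrying the marker but lacking the gene of interest. The model and all names in it are simplifying assumptions, not a description of the actual transfer mechanism.

```python
def transferred(t_dna_from_rb, n_elements):
    """Elements delivered if transfer starts at the right border and stops early."""
    return t_dna_from_rb[:n_elements]


def selected_without_gene(t_dna_from_rb, marker="marker", gene="gene"):
    """Count stopping points that yield the marker but not the gene of interest."""
    count = 0
    for n in range(1, len(t_dna_from_rb) + 1):
        part = transferred(t_dna_from_rb, n)
        if marker in part and gene not in part:
            count += 1
    return count


# Element order is listed from the right border (RB) toward the left border (LB).
marker_next_to_lb = ["gene", "marker"]   # RB - gene - marker - LB
marker_next_to_rb = ["marker", "gene"]   # RB - marker - gene - LB

print(selected_without_gene(marker_next_to_lb))
print(selected_without_gene(marker_next_to_rb))
```

With the marker next to the left border, every transfer that includes the marker has already passed through the gene of interest, so selection for the marker also selects for the gene.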

In the second approach, called the cointegrate vector system, the 
cloning (cointegrate) vector has a plant selectable marker gene, the target 
gene, the right border, an E. coli origin of DNA replication, and a bacterial 
selectable marker gene. Within A. tumefaciens, the cointegrate vector
recombines with a modified (disarmed) Ti plasmid that lacks both the
tumor-producing genes and the right border of the T-DNA, and the entire cloning
vector becomes integrated into the disarmed Ti plasmid to form a
recombinant Ti plasmid (Fig. 18.7B). The cointegrate cloning vector and the
disarmed helper Ti plasmid both carry homologous DNA sequences that
provide a shared site for in vivo homologous recombination; normally 
these sequences lie inside the T-DNA region. Following recombination, the 
cloning vector becomes part of the disarmed Ti plasmid, which provides 
the vir genes necessary for the transfer of the T-DNA to the host plant cells. 
The only way that this cloning vector can be maintained in A. tumefaciens is 
as part of a cointegrate structure. In this cointegrated configuration the 
genetically engineered T-DNA region can be transferred to plant cells. 

A practical problem that arises when using binary vectors is that their
relatively large size (usually >10 kb) often makes it difficult and
inconvenient to manipulate them in vitro. In addition, larger plasmids tend to have
fewer unique restriction sites for cloning purposes. For these reasons, it is
advantageous to develop and use smaller binary vectors. Based on the
DNA sequence of a commonly used binary vector, pBIN19, it was predicted
that more than half of the DNA could be deleted and the vector would still
be completely functional. Thus, instead of the 11.8-kb size of the original
vector, a 3.5-kb mini-binary vector (pCB301) was constructed (Fig. 18.8).
This minivector, which can be used to clone DNA fragments to be
transferred into the plant genome, cannot be introduced into A. tumefaciens by
conjugation because certain regions of DNA required for conjugal transfer
have been deleted. However, electroporation can be used as an alternative
means. To facilitate the use of the minivector, a number of derivatives of
pCB301 were constructed. For example, a bar gene encoding the enzyme
phosphinothricin acetyltransferase, together with a plant promoter and
transcription termination region, was inserted into the multiple cloning site so
that transformants expressing this gene would be easily selected. Adjacent
to the bar gene but in the opposite orientation is an expression cassette
which includes a 35S promoter, a DNA sequence that targets the expressed
protein to either chloroplasts or mitochondria, a translational enhancer
element (not shown in Fig. 18.8) that increases the level of expression of the
protein encoded by the cloned gene, a portion of the multiple cloning site,
and a transcription termination sequence (Fig. 18.8). These derivatives of
the minivector pCB301 are flexible and easy to use and contain a variety of
unique restriction enzyme sites in the multiple cloning site. After the target
gene has been cloned into the multiple cloning site, the final construct is
introduced into A. tumefaciens by electroporation.

FIGURE 18.8 The mini-binary vector pCB301 system. oriV, part of the origin of
replication; npt, neomycin phosphotransferase gene; trfA, part of the origin of
replication; RB, right border of T-DNA; MCS, multiple cloning site; LB, left border of
T-DNA; P35S, 35S constitutive promoter from cauliflower mosaic virus; TP, targeting
protein sequence; T35S, transcription termination sequence from the cauliflower
mosaic virus 35S gene; bar, gene for phosphinothricin acetyltransferase. By varying
the DNA sequence of the TP, the protein encoded by the introduced gene may be
targeted to either the mitochondria or chloroplast. Adapted from Xiang et al., Plant
Mol. Biol. 40:711-717, 1999.

In many instances it may be advantageous to transform plants with 
several foreign genes, for example, genes that encode an entire biochemical 
pathway. While this is not yet commonly done, it is nevertheless possible 
to introduce a large amount of foreign DNA into plants. Although the 
transformation efficiency is low, plants have been successfully transformed 
with large DNA fragments ranging from 30 to 150 kb. 

Although A. tumefaciens-mediated gene transfer systems are effective
in several species, monocotyledonous plants (monocots), including the
world's major cereal crops (rice, wheat, and corn), are not readily
transformed by A. tumefaciens. However, by refining and carefully controlling
conditions, protocols have been devised for the transformation of corn and
rice by A. tumefaciens carrying Ti plasmid vectors. For example, immature
corn embryos were immersed in an A. tumefaciens cell suspension for a few
minutes and then incubated for several days at room temperature in the
absence of selective pressure. The embryos were then transferred to a
medium with a selective antibiotic that allowed only transformed plant
cells to grow. These cells were maintained in the dark for a few weeks.
Finally, the mass of transformed plant cells was transferred to a different
growth medium that contained plant hormones to stimulate differentiation
and incubated in the light, which permitted regeneration of whole
transgenic plants. Many of the early plant transformation experiments were
conducted with limited-host-range strains of Agrobacterium. However,
more recently, broad-host-range strains that infect most plants have been
tested and found to be effective, so many of the plant species that
previously appeared to be refractory to transformation by A. tumefaciens can
now be transformed. Thus, when setting out to transform a new plant
species, it is necessary to determine which Agrobacterium strain and Ti plasmid
are best suited to that particular plant. In addition, modification of the
tissue culture conditions by the inclusion of antioxidants during
transformation of grape, rice, corn, or soybean has been found to increase the
transformation frequencies of those plant cells.

A systematic examination of the conditions that are used in
Agrobacterium-mediated plant transformation revealed that ethylene
significantly decreased the transfer of genes to plant genomes. Ethylene is
produced as a consequence of Agrobacterium infection of plants. To remedy
this, a bacterial gene encoding aminocyclopropane-1-carboxylate (ACC)
deaminase, which when expressed can lower plant ethylene levels (see
chapter 15), was introduced into an A. tumefaciens strain that is utilized to
introduce foreign DNA into plants. When melon cotyledon segments were
genetically transformed using the A. tumefaciens strain expressing ACC
deaminase, the transformation frequency of the plants (as judged by the
level of introduced marker enzyme activity) increased significantly (Fig.
18.9). Although this innovation has yet to be tested with other plants, it is
hoped that the introduction of this ethylene-lowering gene will increase the
transformation frequencies for a wide range of different plants.


Physical Methods of Transferring Genes to Plants 

When the difficulties in transforming some plant species first became 
apparent, a number of procedures that could act as alternatives to 
transformation by A. tumefaciens were developed (Table 18.1). A number of these 
methods require the removal of the plant cell wall to form protoplasts. 
Plant protoplasts can be maintained in culture as independently growing 
cells, or with a specific culture medium, new cell walls can be formed and 
whole plants can be regenerated. In addition, transformation methods that 
introduce cloned genes into a small number of cells of a plant tissue from 
which whole plants can be formed, thereby bypassing the need for 
regeneration from a protoplast, have been developed. At present, most researchers 
favor the use of either Ti plasmid-based vectors or microprojectile 
bombardment to deliver DNA into plant cells. A very large number of different 
plants have been genetically transformed with these various techniques 
(Table 18.2). 


FIGURE 18.9 Effect of lowering ethylene levels on the transformation of melon 
cotyledons. Following transformation, the activity of the marker enzyme β-glucuronidase 
was measured. Treatments: 1, no A. tumefaciens; 2, A. tumefaciens carrying the 
marker gene on a Ti plasmid; 3, A. tumefaciens carrying the marker gene on a Ti 
plasmid with aminoethoxyvinylglycine (AVG), a chemical inhibitor of ethylene 
synthesis, added to the system; 4, A. tumefaciens carrying the marker gene on a Ti 
plasmid and an ACC deaminase gene on a separate plasmid. 


TABLE 18.1 Plant cell DNA-delivery methods 

Method                                       Comment 
Ti plasmid-mediated gene transfer            An excellent and highly effective system that is limited to a few kinds of plants 
Microprojectile bombardment                  Used with a wide range of plants and tissues; easy and inexpensive 
Viral vectors                                Not an effective way to deliver DNA to plant cells 
Direct gene transfer into plant protoplasts  Can be used only with plant cell protoplasts that can be regenerated into viable plants 
Microinjection                               Has limited usefulness because only one cell can be injected at a time; requires the services of a highly skilled individual 
Electroporation                              Generally limited to plant cell protoplasts that can be regenerated into viable plants 
Liposome fusion                              Can be used only with plant cell protoplasts that can be regenerated into viable plants 

Microprojectile Bombardment 

Microprojectile bombardment, also called biolistics, is the most important 
alternative to Ti plasmid DNA delivery systems for plants. Spherical gold 
or tungsten particles (approximately 0.4 to 1.2 µm in diameter, or about the 
size of some bacterial cells) are coated with DNA that has been precipitated 
with CaCl2, spermidine, or polyethylene glycol. The coated particles are 
accelerated to high speed (300 to 600 meters/second) with a special 
apparatus called a particle gun (or gene gun). The original version of the gene 
gun used a small amount of gunpowder to provide the propelling force. 
The device that is currently used employs high-pressure helium as the 
source of particle propulsion (Fig. 18.10). The projectiles can penetrate 
plant cell walls and membranes; however, the particle density used does 
not significantly damage the cells. The extent of particle penetration into 
the target plant cells may be controlled by varying the intensity of the 
explosive burst, altering the distance that the particles must travel before 
reaching the target cells, or using different-size particles. 

Once inside a cell, the DNA is removed from the particles and, in some 
cells, integrates into the plant DNA. Microprojectile bombardment can be 
used to introduce foreign DNA into plant cell suspensions, callus cultures, 
meristematic tissues, immature embryos, protocorms, coleoptiles, and 
pollen in a wide range of different plants, including monocots and conifers, 
plants that are less susceptible to Agrobacterium-mediated DNA transfer 
(Table 18.3). Furthermore, this method has also been used to deliver genes 
into chloroplasts and mitochondria, thereby opening up the possibility of 
introducing exogenous (foreign) genes into these organelles. 


TABLE 18.2 Plants that have been genetically transformed 

Alfalfa      Carnation     Kiwi fruit     Papaya        Potato       Sunflower 
Apple        Carrot        Lettuce        Pea           Red fescue   Sweet potato 
Arabidopsis  Corn (maize)  Licorice       Peanut        Rice         Tall fescue 
Asparagus    Cotton        Lily           Pear          Rye          Tobacco 
Banana       Cranberry     Lotus          Pearl millet  Sorghum      Tomato 
Barley       Cucumber      Norway spruce  Peony         Soybean      Wheat 
Bean         Eggplant      Oat            Petunia       Strawberry   White spruce 
Cabbage      Flax          Orchard grass  Plantain      Sugar beet 
Canola       Grape         Orchid         Poplar        Sugarcane 

Typically, plasmid DNA dissolved in buffer is precipitated onto the 
surfaces of the microprojectiles. Using this procedure, it is possible to 
increase the transformation frequency by increasing the amount of plasmid 
DNA; however, too much plasmid DNA can be inhibitory. It is estimated 
that there are approximately 10,000 transformed cells formed per 
bombardment. With this technique, cells that appear to be transformed, based on the 
expression of a marker gene, often only transiently express the introduced 
DNA. Unless the DNA becomes incorporated into the genome of the plant, 
the foreign DNA will eventually be degraded. 

FIGURE 18.10 Schematic representation of a microprojectile bombardment 
apparatus. When the helium pressure builds up to a certain point, the plastic 
rupture disk bursts, and the released gas accelerates the flying disk with the 
DNA-coated gold particles on its lower side. The gold particles pass the 
stopping screen, which holds back the flying disk, and penetrate the cells of the 
sterile leaf. Diagram labels: rupture disk, flying disk, DNA-coated gold 
particles, stopping screen, evacuated chamber. 


TABLE 18.3 Transgenic plants formed by microprojectile bombardment of various 
plant cells 

Plant(s)                  Cell source(s) 
Corn                      Embryonic cell suspension, immature zygotic embryos 
Rice                      Immature zygotic embryos, embryogenic callus 
Barley                    Cell suspension, immature zygotic embryos 
Wheat                     Immature zygotic embryos 
Turfgrass                 Embryogenic callus 
Rye                       Meristems 
Sorghum                   Immature zygotic embryos 
Pearl millet              Immature zygotic embryos 
Orchid                    Protocorms 
Banana and plantain       Embryonic cell suspension 
Poplar                    Callus 
Norway and white spruce   Somatic embryos 
Pea                       Zygotic embryos 
Cucumber                  Embryogenic callus 
Sweet potato              Callus 
Cranberry                 In vitro-derived stem sections 
Peony and lily            Pollen 
Alfalfa                   Embryogenic callus 
Bean                      Zygotic embryos 
Cotton                    Zygotic embryos 
Grape                     Embryonic cell suspension 
Peanut                    Embryogenic callus 
Tobacco                   Pollen 

Adapted from Southgate et al., Biotechnol. Adv. 13:631-651, 1995. 


The configuration of the vector that is used for biolistic delivery of 
foreign genes to plants influences both the integration and expression of 
those genes. For example, transformation is more efficient when linear 
rather than circular DNA is used. Moreover, large plasmids (>10 kb), in 
contrast to small ones, may become fragmented during microprojectile 
bombardment and therefore produce lower levels of foreign-gene 
expression. However, large segments of DNA may be introduced into plants 
using yeast artificial chromosomes (YACs) (see chapter 7). The YACs were 
engineered to contain plant selectable markers, as well as yeast selectable 
markers, which were already present on the YAC vector (Fig. 18.11). As a 
test system, various amounts of DNA from the fungus Cochliobolus 
heterostrophus were cloned so that the total size of the engineered YAC ranged 
from 80 to 550 kb. Following biolistic transfer of the engineered YACs with 
fungal DNA to plant (tobacco) cells, a number of transformants resistant to 
the antibiotic kanamycin were isolated. These transformed plant cells were 
then tested for the presence of the second plant selectable marker gene 
(encoding resistance to the antibiotic hygromycin), which was located on 
the other arm of the YAC vector. The presence of both plant selectable 
marker genes in transformed plant cells indicated that the entire YAC, 
along with all of the inserted foreign DNA, was probably transferred. DNA 
hybridization experiments revealed that YACs up to 150 kb in total size 
have a good chance of being transferred to plant cells and that the 
transferred DNA can be stably integrated into the plant cell. Thus, the 
production of transgenic plants that contain several foreign genes is feasible; 
eventually, entire biosynthetic pathways may be introduced into plant 
cells. 
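
The reasoning behind the two-marker test can be sketched in a few lines of Python: a marker on each arm of the YAC is scored, and only cell lines carrying both are taken as likely to have received the whole construct. This is an illustration only; the function name and the cell line records are hypothetical and are not data from the study.

```python
def classify_transformant(kan_resistant: bool, hyg_resistant: bool) -> str:
    """Classify a bombarded cell line by which YAC-arm markers it carries."""
    if kan_resistant and hyg_resistant:
        # Markers on both arms are present, so the DNA between them
        # was probably transferred as well.
        return "probably intact YAC"
    if kan_resistant or hyg_resistant:
        # Only one arm detected: the YAC was likely broken during transfer.
        return "partial transfer (one arm only)"
    return "not transformed"


# Hypothetical screening results: (kanamycin resistant, hygromycin resistant)
cell_lines = {
    "line_1": (True, True),
    "line_2": (True, False),
    "line_3": (False, False),
}

for name, markers in cell_lines.items():
    print(name, "->", classify_transformant(*markers))
```

A call of "probably intact YAC" is only an inference from the two markers; as in the experiment described above, DNA hybridization is still needed to confirm that the insert is present and integrated.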


Chloroplast Engineering 

While the vast majority of plant genes are found as part of the nuclear 
DNA, both the chloroplast and mitochondrion contain genes that encode a 
number of important and unique functions. However, not all of the 
proteins that are present in these organelles are encoded by organellar DNA. 
Some chloroplast and mitochondrion proteins are encoded in the nuclear 
DNA, synthesized in the cell's cytoplasm, and then, by a special 
mechanism, imported into the appropriate organelle. Accordingly, there are two 
ways that a specific foreign protein can be introduced into the chloroplast 
or mitochondrion. In one way, a fusion gene encoding the foreign protein 
and additional amino acids that direct the transport of the protein to the 
organelle can be inserted into the nuclear chromosomal DNA, and after 
synthesis, the recombinant protein can be transported into the targeted 
organelle. In the other way, the gene for the foreign protein can be inserted 
directly into either the chloroplast or mitochondrial DNA. 

FIGURE 18.11 Schematic representation of a YAC vector used to transfer large pieces 
of DNA to plant genomes. TEL, telomere; SM, selectable marker; CEN, centromere. 
The various elements are not drawn to scale; the foreign DNA, especially, is much 
larger than shown. Each of the plant selectable marker genes contains its own 
promoter and transcription terminator (not shown). The plant selectable markers are a 
hygromycin resistance gene and a kanamycin resistance gene. Adapted from 
Mullen et al., Mol. Breed. 4:449-457, 1998. Diagram labels, left to right: TEL, 
yeast SM1, plant SM1, CEN, foreign DNA, plant SM2, yeast SM2, TEL. 


Regeneration of Normal and Fertile Plants That Express 
Octopine Synthase from Tobacco Crown Galls after 
Deletion of Tumour-Controlling Functions 

H. De Greve, J. Leemans, J. P. Hernalsteens, L. Thia-Toong, 
M. De Beuckeleer, L. Willmitzer, L. Otten, M. Van Montagu, 
and J. Schell 

Nature 300:752-755, 1982 

An efficient vector system is essential for the routine construction of 
transgenic plants. Most of the early efforts to develop such a vector focused 
on the Ti plasmid of the soil bacterium A. tumefaciens because, following 
infection of susceptible dicots, a portion of the Ti plasmid (the T-DNA) is 
inserted directly into the chromosomal DNA of the host plant cells. However, 
infection of plants with the native Ti plasmid results in the formation of a 
crown gall tumor that interferes with the normal growth of the plant. 
Therefore, before the Ti plasmid could be used as a vector to transform 
plants, crown gall tumor formation had to be prevented. 

By studying the mRNAs that were transcribed from intact and modified 
T-DNAs, Schell and his colleagues determined that the tumor-inducing 
genes were a part of the T-DNA. Therefore, a modified T-DNA in which 
the tumor-inducing genes were deleted was constructed. This modified 
T-DNA was then introduced by homologous recombination into the Ti 
plasmid. Such a "disarmed" Ti plasmid could be introduced into plant cells 
and transfer its T-DNA into their chromosomal DNA. Under these 
conditions, the modified T-DNA was stably maintained in the plant genome, 
and importantly, no crown gall tumors were formed. The next logical step 
in the development of this system was the cloning of foreign marker and 
target genes into the T-DNA so that they could also be transferred to the 
chromosomal DNA of the host plant. This vector system utilizing the Ti 
plasmid has become the system of choice for creating transgenic plants and 
has been used successfully in thousands of laboratories around the world. 


Most higher plants have approximately 50 to 100 chloroplasts per leaf 
cell, and each chloroplast has about 10 to 100 copies of the chloroplast DNA 
genome. Stable genetic transformation of chloroplasts in order to modify 
chloroplast functioning or to produce foreign proteins requires insertion of 
the foreign DNA into the chloroplast genome rather than into the much 
larger chromosomal DNA. (Plant chromosomal DNA is generally around 
10^4 to 10^5 times larger than chloroplast DNA.) Moreover, the foreign DNA 
needs to be present in all of the approximately 10^3 to 10^4 chloroplast DNA 
genomes per leaf cell. 
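
The number of chloroplast genomes per leaf cell follows directly from the two ranges given above. A minimal Python sketch of the arithmetic, in which the function name is an illustrative choice and the inputs are simply the ends of the quoted ranges:

```python
def genome_copies_per_cell(chloroplasts_per_cell: int,
                           copies_per_chloroplast: int) -> int:
    """Total chloroplast genome copies in one leaf cell."""
    return chloroplasts_per_cell * copies_per_chloroplast


# Lower and upper ends of the ranges quoted in the text.
low = genome_copies_per_cell(50, 10)
high = genome_copies_per_cell(100, 100)

print(low, high)  # 500 10000
```

The product runs from 500 to 10,000 copies, which is roughly the range of a thousand to ten thousand genomes per cell cited in the text, and every one of those copies must carry the foreign DNA for the transformation to be stable.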

Foreign DNA is typically introduced by microprojectile bombardment 
into the chloroplast genome on a plasmid vector with both the (usually 
nonselectable) foreign DNA and a selectable marker, such as an antibiotic 
resistance gene, flanked by specific chloroplast DNA sequences (Fig. 18.12). 
Homologous recombination is the normal mode of DNA integration into 
the chloroplast genome. 

Some chloroplast genes are transcribed by chloroplast-encoded RNA 
polymerase plus a nucleus-encoded RNA polymerase sigma factor, while 
others are transcribed solely by a nucleus-encoded RNA polymerase. The 
promoter sequences that are recognized by these two different RNA 
polymerases are completely different. At present, it is not known how the use 
of one or the other type of promoter sequence affects the expression of the 
downstream genes. The efficient expression of foreign genes in the 
chloroplast requires not only the use of an appropriate promoter sequence, but 
also the presence of the correct sequences in the 5' and 3' untranslated 
regions of the messenger RNA (mRNA). Many biotechnological 
applications have focused on the strong sigma-70-type ribosomal RNA (rRNA) 
promoter, which is recognized by the chloroplast-encoded RNA 
polymerase. This promoter is fused at the DNA level with chloroplast 
translational control sequences, followed by the gene of interest and a 3' 
chloroplast untranslated region containing a stem-and-loop structure, 
which may act as a transcription termination signal (Fig. 18.13). Any 
nucleus-encoded proteins that are synthesized outside of the chloroplast 
contain a transit peptide that is removed as the protein is imported into the 
chloroplast. 


FIGURE 18.12 Plasmid vector used for integrating a foreign gene and a marker gene 
into the chloroplast genome. Regulatory sequences are not shown. Homologous 
recombination can occur between chloroplast DNA sequences on the vector and the 
chloroplast genome. Spc^r, spectinomycin resistance. Diagram labels: foreign 
gene, Spc^r gene. 


Chloroplast DNA is inherited in a non-Mendelian fashion. In most 
angiosperm plants, it is maintained in egg cells, but not sperm (pollen) 
cells, and is therefore transmitted uniparentally by the female. Thus, pollen 
cannot transmit the contents of the chloroplast genome to the zygote. This 
has practical importance, since this trait could prevent the spread of foreign 
genes, localized in the chloroplast, through pollen to neighboring plants, 
thereby addressing one of the concerns of critics of the genetic engineering 
of plants. 

An important component of a successful chloroplast transformation 
system is the availability of suitable selectable marker and reporter genes 
that facilitate selection and analysis of cells with transgenic chloroplasts. A 
number of selectable marker genes are currently available for monitoring 
chloroplast transformation (Table 18.4). Most transformed chloroplasts are 
selected by resistance to spectinomycin, streptomycin, or kanamycin, all of 
which inhibit protein synthesis on prokaryotic-type ribosomes. These 
antibiotics inhibit greening, faster proliferation, and shoot formation. However, 
plant cells with transformed chloroplasts that express the genes conferring 
resistance to the antibiotic are readily identified in the presence of 
(otherwise) inhibitory antibiotics by their greening, faster proliferation, and shoot 
formation. Since only one, or at most a few, of the chloroplasts actually 
incorporates the foreign DNA, repeated rounds of growth on selective 
antibiotics are often required so that the chloroplast population becomes 
enriched and eventually dominated by the transformants. Eventually, all 
chloroplast genomes that do not carry the transgenic DNA with its 
selectable marker gene are lost. 
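
Why several rounds of selection are needed can be illustrated with a toy calculation. The Python sketch below is not a model taken from the text: it assumes, purely for illustration, that in each round of growth on antibiotic the transformed genomes replicate with a fixed relative advantage over untransformed ones, and it counts the rounds until transformants make up nearly the whole population. The starting fraction and the advantage value are hypothetical, and real chloroplast genomes sort out stochastically rather than deterministically.

```python
def rounds_to_dominance(initial_fraction, advantage, target=0.999, max_rounds=100):
    """Count selection rounds until transformed genomes reach the target fraction.

    initial_fraction: starting fraction of transformed genome copies (assumed)
    advantage: relative replication advantage under selection (assumed)
    Returns the number of rounds, or None if the target is not reached.
    """
    fraction = initial_fraction
    for round_number in range(1, max_rounds + 1):
        transformed = advantage * fraction
        # Renormalize so transformed and untransformed fractions sum to 1.
        fraction = transformed / (transformed + (1.0 - fraction))
        if fraction >= target:
            return round_number
    return None


# Hypothetical case: 1 transformed copy in 1,000, threefold advantage per round.
print(rounds_to_dominance(0.001, 3.0))
```

Even with a strong assumed advantage, a single transformed copy among a thousand takes many rounds to dominate, which is consistent with the repeated selection described above. With no advantage (a value of 1.0) the function returns None, since nothing enriches the transformants.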

FIGURE 18.13 Organization of an engineered chloroplast gene transcription cassette. 
UTR, untranslated region. 


TABLE 18.4 Some foreign genes that have been used as selectable markers and 
reporters of transgenic chloroplasts 

Gene    Gene product                            Function 
aadA    Aminoglycoside 3'-adenylyltransferase   Positive selection (spectinomycin and streptomycin resistance) 
nptII   Neomycin phosphotransferase             Positive selection (kanamycin resistance) 
uidA    β-Glucuronidase                         Reporter gene 
gfp     Green fluorescent protein               Reporter gene 
codA    Cytosine deaminase                      Negative selection (5-fluorocytosine sensitivity) 

Adapted from Hager and Bock, Appl. Microbiol. Biotechnol. 54:302-310, 2000. 


The chloroplast genome is an attractive location for engineering any 
gene that, as a consequence of the multiple copies of the chloroplast DNA, 
might benefit from high levels of expression. For example, active human 
somatotropin accumulated to 7% of the total soluble protein in tobacco 
chloroplasts. In addition, most chloroplast-borne genes are organized into 
operons and produce polycistronic mRNAs. This should make it easier to 
engineer plants with transgenic chloroplasts that express several genes that 
are regulated together as part of a new metabolic pathway than to 
coordinately express several genes under the control of different promoters that 
have been introduced into a plant's chromosomal DNA. 

Chloroplasts belong to a group of plant organelles called plastids, 
which contain approximately 120 to 180 kb of circular double-stranded 
DNA bounded by a double membrane. Plastids include amyloplasts, which 
contain starch grains; chloroplasts, which contain chlorophyll; elaioplasts, 
which contain oil; and chromoplasts, which contain other pigments. While 
photosynthetic tissues, such as green leaves, contain chloroplasts, other 
tissues contain other types of plastids. For example, tomato fruit contains a 
large number of chromoplasts. Thus, in much the same way that foreign 
DNA may be expressed as a part of the chloroplast genome, it can also be 
targeted for expression in the chromoplast. 

Researchers can transform tomato plastids and obtain high-level 
expression of foreign proteins both in green leaves and in tomato fruit. One of the 
advantages of this system is that transgenic tomatoes expressing high levels 
of certain foreign proteins (which are normally found as part of an animal 
or human pathogen) may be used as edible vaccines (see chapter 20). 


Use of Reporter Genes in Transformed Plant Cells 

It is essential to be able to detect the foreign DNA that has been integrated 
into plant genomic DNA so that those cells that have been transformed can 
be identified; this requires the use of a selectable marker. Furthermore, in 
studies of plant transcriptional regulatory signals and the functioning of 
these signals in specific plant tissues (such as leaves, roots, and flowers), it 
is often important to be able to quantify the level of expression of a gene 
with a readily identified product. Quantification and other applications 
require the use of reporter genes that encode an activity that can be assayed. 
To these ends, a number of different genes have been tested as reporters for 
transformation, including genes that can be used as dominant selectable 
markers and genes whose proteins produce a detectable response to a 
specific assay (Table 18.5). Many of these reporter genes are from bacteria and 
have been equipped with plant-specific regulatory sequences for 
expression in plant cells. Dominant-marker selection provides a direct means of 
obtaining only transformed cells in culture. For example, in the presence of 
the antibiotic kanamycin, only plant cells with a functional neomycin 
phosphotransferase gene can grow. 


TABLE 18.5 Plant cell reporter and selectable marker gene systems 

Enzyme activity                               Selectable marker   Reporter gene 
Neomycin phosphotransferase                   Yes                 Yes 
Hygromycin phosphotransferase                 Yes                 Yes 
Dihydrofolate reductase                       Yes                 Yes 
Chloramphenicol acetyltransferase             Yes                 Yes 
Gentamicin acetyltransferase                  Yes                 Yes 
Nopaline synthase                             No                  Yes 
Octopine synthase                             No                  Yes 
β-Glucuronidase                               No                  Yes 
Streptomycin phosphotransferase               Yes                 Yes 
Bleomycin resistance                          Yes                 No 
Firefly luciferase                            No                  Yes 
Bacterial luciferase                          No                  Yes 
Threonine dehydratase                         Yes                 Yes 
Metallothionein II                            Yes                 Yes 
enol-Pyruvylshikimate-3-phosphate synthase    Yes                 No 
Phosphinothricin acetyltransferase            Yes                 Yes 
β-Galactosidase                               No                  Yes 
Blasticidin S deaminase                       Yes                 Yes 
Acetolactate synthase                         Yes                 No 
Bromoxynil nitrilase                          Yes                 No 
Green fluorescent protein                     No                  Yes 

Adapted from Walden and Schell, Eur. J. Biochem. 192:563-576, 1990, and Gruber and Crosby, p. 89-119, 
in B. R. Glick and J. E. Thompson (ed.), Methods in Plant Molecular Biology and Biotechnology (CRC Press, Boca 
Raton, FL, 1993). 


The desired outcome of a particular experiment often dictates which 
reporter gene will be used. Clearly, when the expression of a reporter gene 
interferes with normal plant functions, it cannot be used. Moreover, the 
presence of some reporter genes and their products may taint a commercial 
product. In this context, it is best to remove the reporter gene once 
transformants with the desired traits have been selected, especially in crop plants. 

Some reporter gene products, e.g., β-D-glucuronidase (GUS), both 
firefly and bacterial luciferases, and green fluorescent protein (GFP), can be 
detected in situ in intact plant tissues. One of the most popular of these 
systems is the E. coli GUS gene. The GUS gene encodes a stable enzyme that 
is not normally present in plants and that catalyzes the cleavage of a range 
of β-D-glucuronides. The GUS activity in transformed plant tissues can be 
localized by the presence of a blue color that is formed after the hydrolysis 
of the uncolored substrate 5-bromo-4-chloro-3-indolyl β-D-glucuronic acid. 
Alternatively, GUS activity in plant extracts can be quantitatively and 
sensitively assayed by a fluorometric analysis that involves the hydrolysis of 
the substrate 4-methylumbelliferyl β-D-glucuronide to form a fluorescent 
product. 

GFP is an ideal in vivo marker for monitoring transgenic plants because 
it fluoresces green when excited with either ultraviolet or blue light and 
does not require the addition of any substrates or cofactors. Normally, 
wild-type plants (which do not contain the gene for GFP) fluoresce reddish 
purple when they are excited by ultraviolet or blue light. Moreover, 
derivatives of the gene that encodes GFP that have different spectral 
properties and increased levels of fluorescence have been constructed. 
Importantly, the expression of GFP has no effect on the survival and growth 
of transformed plants under field conditions. 


Manipulation of Gene Expression in Plants 

When genetic transformation of plants became routine, research efforts 
were directed toward introducing a wide range of plant and bacterial genes 
into plant cells. The transformed plants were assayed for the production of the 
foreign protein and studied physiologically to assess how the presence of an 
additional protein affected the whole plant. Many of these early 
experiments utilized promoters that were expressed constitutively in a range of 
plant cells. More recently, many additional plant promoters have been 
isolated and characterized and used to express foreign proteins in specific cells 
at certain times during the growth and development of the plant. For 
example, instead of the strong constitutive 35S promoter from cauliflower 
mosaic virus, which is expressed in all plant tissues and throughout the life 
of the plant, researchers have used the promoter for the small subunit of the 
photosynthetic enzyme ribulosebisphosphate carboxylase, which is active 
only in photosynthetic tissues, such as leaves. Similarly, plant promoters 
active only in specific tissues, such as roots or flowers, or only during 
periods of environmental stress (e.g., the pathogenesis-related promoters), 
have been used to control the expression of some foreign genes. 

Isolation and Use of Different Promoters 

In order to minimize any deleterious effects on plant growth from the 
expression of foreign genes, it is necessary to regulate where and when 
introduced genes are expressed, as well as the amount of the foreign 
protein that is produced. While a major factor in the initiation of 
transcription in plants is the binding of RNA polymerase II to the promoter 
sequence, other factors affect this process. Much of the specificity of 
transcription in plants is controlled by sequence-specific transcription factors 
and/or enhancer-binding proteins (Fig. 18.14). Enhancer elements are 
regions on the DNA that bind to enhancer-binding proteins, which interact 
with transcription factors and RNA polymerase to maximize the level of 
transcription from a particular promoter. Enhancer sequences may be 
located several thousand base pairs from the promoter sequence, although 
they are typically much closer. Enhancer sequences are generally 
considered to be important determinants of the tissue and temporal specificity of 
gene expression. While the importance of promoter and enhancer sequences 
has been recognized, a thorough understanding of how these and other 
elements regulate plant gene expression is still at an early stage. Despite the 
complexity of this system, there have been several successful experiments 
in which "designer promoters" were created. 

FIGURE 18.14 Schematic representation of a plant RNA transcription complex 
including RNA polymerase II binding to the TATA box (an essential portion of a 
eukaryotic promoter), various transcription factors bound to the polymerase, and 
an enhancer-binding protein bound to an enhancer element on the DNA, which is 
located at some distance from the promoter sequence. 


Specialized vectors, called promoter-tagging (labeling) vectors, have 
been used to isolate plant promoters from several plant species. This 
approach relies on the Agrobacterium-mediated Ti plasmid transformation 
system. Briefly, a promoterless reporter gene is placed next to the right 
border of a Ti plasmid vector. After transfer of the T-DNA into a plant 
chromosome, the reporter gene from the vector is inserted randomly into the 
plant DNA (Fig. 18.15A). If the T-DNA is inserted immediately downstream 
of the promoter region of a functional gene, transcription of the reporter 
gene occurs. For example, with the neomycin phosphotransferase (npt) gene 
as a reporter, its expression is detected by selecting kanamycin-resistant 
transformants. However, with this method, only constitutive promoters will 
be selected. Thus, it is difficult to identify (tag) a promoter that is active only 
during a certain developmental stage or that is induced by a specific 
environmental factor because insertion downstream from this type of promoter 
will be selected against using kanamycin. To overcome this problem, a 
two-gene selectable marker system was devised. In this case, a hygromycin 
resistance gene is placed under the control of a constitutive promoter next 
to a promoterless reporter gene within the T-DNA (Fig. 18.15B). After 
hygromycin-resistant transformants are selected, the transformants can be 
checked by an enzyme assay that measures neomycin phosphotransferase 
activity under different conditions for expression to identify potentially 
useful plant promoters. With this strategy, 5 to 30% of the transformed plant 
cells have the reporter gene under the control of an active promoter. 
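
The screening logic of the two-gene system can be sketched in Python. In this illustration every transformant is assumed to have already passed hygromycin selection, and the reporter (neomycin phosphotransferase) activity measured under each test condition decides how the tagged promoter is classified. The function name, the activity threshold, the condition names, and the measurements are all hypothetical.

```python
def classify_tagged_promoter(npt_activity: dict, threshold: float = 1.0) -> str:
    """Classify a tagged promoter from reporter activity per test condition.

    npt_activity maps a condition name to measured NPT activity
    (arbitrary units); the threshold for "active" is an assumed cutoff.
    """
    active = sorted(c for c, a in npt_activity.items() if a >= threshold)
    if not active:
        return "no active promoter tagged"
    if len(active) == len(npt_activity):
        return "constitutive candidate"
    return "conditional candidate (active in: " + ", ".join(active) + ")"


# Hypothetical assay results for three hygromycin-resistant transformants.
transformants = {
    "T1": {"leaf": 5.2, "root": 4.8, "heat stress": 5.0},
    "T2": {"leaf": 0.1, "root": 3.9, "heat stress": 0.2},
    "T3": {"leaf": 0.0, "root": 0.1, "heat stress": 0.0},
}

for name, activity in transformants.items():
    print(name, "->", classify_tagged_promoter(activity))
```

Because selection rests on the constitutively expressed hygromycin resistance gene, a transformant such as the hypothetical T2 survives even though its reporter is silent in most conditions, which is exactly the class of promoter that direct kanamycin selection would have missed.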

FIGURE 18.15 Using a promoterless reporter gene to isolate a plant promoter. (A) A 
promoterless neomycin phosphotransferase (npt) reporter gene is placed 
downstream from a right-border (RB) sequence of the T-DNA. After transfer of the 
T-DNA into the plant chromosomal DNA, if the T-DNA is inserted near the 
promoter (P) region of a functional plant gene that is oriented in the same direction as 
npt, transcription occurs. The expression of npt is detected by selecting for 
kanamycin-resistant transformants. (B) To ensure that transformed cells are selected, a 
hygromycin resistance (Hyg^r) gene, under the control of a constitutive promoter (P), 
is placed downstream from the promoterless reporter gene within the T-DNA. Both 
the npt and Hyg^r genes are equipped with transcription terminator (TT) regions. 


The cauliflower mosaic virus 35S promoter is frequently used as a 
strong promoter in plant systems, although the level of expression of a 
foreign protein under the control of this promoter is often lower than 
desired. To address this problem, it is necessary to test different 
promoter-gene constructs in plants to see if more effective promoters can be found. 
In addition to the promoter, several other elements may enhance 
foreign-gene expression. As indicated above, these include enhancer sequences 
that are typically found from one to several hundred nucleotides upstream 
of the promoter sequence, introns that may stabilize mRNA, and 
transcription terminator sequences. 

In one series of experiments, DNA constructs that contained all or 
some of the following elements were tested: the 35S promoter, the nopaline 
synthase gene transcription terminator, from one to seven tandemly 
repeated enhancer elements, and a DNA sequence from tobacco mosaic 
virus called Ω (omega) that increases gene expression at the translational 
level (Fig. 18.16). The most active construct contained seven enhancer elements 
and directed a much higher level of foreign-gene expression in both 
transgenic tobacco and rice plants than the 35S promoter alone (Table 18.6). 
The expression levels of foreign genes from these promoter constructs were 
quite variable in transgenic plants. This variation is thought to be due to 
the site within the plant genome where the T-DNA is inserted. Nevertheless, 
this work shows that it is possible to engineer promoters that are much 
stronger than the naturally occurring 35S promoter. With this approach, it 
should be possible to engineer promoters that are tissue specific, 
developmentally regulated, and strong. 
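
As a rough illustration of how the values in Table 18.6 can be compared, the following Python sketch recomputes the ratio of composite-promoter to 35S-promoter expression for each plant. The numbers are the normalized activities given in the table; the dictionary layout and the function name fold_increase are assumptions made for this example only.

```python
# Relative reporter activities from Table 18.6
# (the 35S average is set to 1.0 separately for each plant species).
TABLE_18_6 = {
    "tobacco": {"avg_35s": 1.0, "avg_composite": 2.8,
                "max_35s": 2.8, "max_composite": 18.3},
    "rice":    {"avg_35s": 1.0, "avg_composite": 14.4,
                "max_35s": 7.2, "max_composite": 47.1},
}


def fold_increase(row, kind):
    """Return the composite/35S expression ratio for kind = 'avg' or 'max'."""
    return row[f"{kind}_composite"] / row[f"{kind}_35s"]


for plant, row in TABLE_18_6.items():
    avg_ratio = round(fold_increase(row, "avg"), 1)
    max_ratio = round(fold_increase(row, "max"), 1)
    print(plant, "average:", avg_ratio, "maximum:", max_ratio)
```

With these values, the best composite-promoter plant is about 6.5-fold more active than the best 35S plant in both tobacco and rice, even though the average increases differ (2.8-fold versus 14.4-fold).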

Gene Targeting 

In bacteria, and to a lesser extent in animal cells, it is relatively straightforward 
to alter the genomic DNA of an organism by homologous recombination 
between the native form of the target DNA in the genome and a 


FIGURE 18.16 Example of a composite promoter assembled from a variety of different 
elements, including enhancer elements (only one is shown; however, the most 
active composite promoter contained seven), a 35S promoter, a 5' untranslated 
region (UTR) containing an Ω sequence, the gene of interest, and a 3' UTR 
containing a transcriptional terminator sequence. 
TABLE 18.6 Testing promoter constructs in transgenic plants 

           Average gene expression      Maximum gene expression 
Plant      35S         Composite        35S         Composite 
           promoter    promoter         promoter    promoter 
Tobacco    1.0         2.8              2.8         18.3 
Rice       1.0         14.4             7.2         47.1 

Adapted from Mitsuhara et al., Plant Cell Physiol. 37:49-59, 1996. 

E. coli β-glucuronidase was the reporter gene. Enzyme activities were normalized to the average value 
per plant when the gene was under the control of the 35S promoter. The actual values in tobacco are 
approximately 30-fold higher than the values in rice. The composite promoter included the 35S promoter, 
the nopaline synthase gene transcription terminator, seven tandemly repeated enhancer elements, and the 
tobacco mosaic virus Ω sequence. Average gene expression is the mean value of a number of different 
transgenic plants. Maximum gene expression is the highest value observed in any transgenic plant with 
that promoter. 


modified form of the target DNA, usually on a plasmid vector (see chapter 
6). Using similar techniques, the targeted alteration of plant cell genes 
occurs quite infrequently. However, based on successful experiments in 
which genomic changes were introduced into animal cells, researchers 
have used RNA-DNA chimeric molecules (actually, the RNA is 2'-O-methyl 
RNA) to stably change the genomic DNA of plant cells (Fig. 18.17). These 
chimeric oligonucleotides are designed to have one or more bases that do 
not pair with the endogenous plant DNA sequence. Following the delivery 
of a chimeric oligonucleotide into a plant cell by microprojectile bombardment, 
it is thought that DNA repair enzymes recognize the mismatches 
between the targeted gene and a large molar excess of the chimeric 
oligonucleotide. During the repair process, the altered DNA is incorporated into 
the plant genome. The changed chromosomal DNA can be readily detected 
phenotypically if the mutation that is created is dominantly or 
codominantly expressed. This is because plants are diploid, with two copies of 
each gene, and this procedure typically changes only one of those copies. 
In addition to changing one or two bases in the sequence of a plant gene, 
this technique may also be used to modify plant DNA through the 
site-specific insertion or deletion of a single base. 
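
The base change illustrated in Fig. 18.17 can be checked with a few lines of Python. This is a minimal sketch: the two-entry codon table and the helper names are assumptions made for illustration, and they cover only the two codons named in the figure legend.

```python
# Only the two codons discussed in the legend to Fig. 18.17.
CODON_TO_AA = {"CCA": "Pro", "CAA": "Gln"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(seq_5_to_3):
    """Return the complementary strand, read 3'->5' (base-by-base complement)."""
    return "".join(COMPLEMENT[base] for base in seq_5_to_3)


def mismatches(a, b):
    """Return the 0-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


original, edited = "CCA", "CAA"
print(CODON_TO_AA[original], "->", CODON_TO_AA[edited])
print("oligonucleotide strand 3'->5':", complement_3_to_5(edited))
print("mismatched positions:", mismatches(original, edited))
```

The sketch shows that a single mismatch at the middle position of the codon is sufficient to convert proline to glutamine, and that the strand complementary to CAA reads 3'-GTT-5', as in the figure legend.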

Gene targeting is extremely inefficient in plants, where the frequency 
of random DNA integration into the plant genome generally exceeds that 
of homologous integration by 3 or 4 orders of magnitude. Moreover, 
because the transformation frequency in plants is often around 1 to 5%, the 
number of seeds or plant cell cultures that must be screened to detect a 
single homologous-integration event is around 10^4 to 10^6. However, in 


FIGURE 18.17 Example of a chimeric oligonucleotide used to change the nucleotide 
sequence of plant DNA. The 2'-O-methyl RNA residues are shown in lowercase red 
letters. The DNA residues are shown in uppercase letters. The codon that was 
targeted for change is shown in blue. In this case, a CCA codon (which encodes the 
amino acid proline) that is part of a tobacco acetolactate synthase (ALS) gene was 
changed to CAA (the complementary sequence of the 3'-GTT-5' sequence 
shown in blue), which encodes glutamine. Plants with the mutated gene may be 
selected for by their resistance to the toxic effects of sulfonylurea herbicides, which 
normally target the ALS enzyme. 
plants that have been engineered to overexpress the RAD54 gene from the 
yeast Saccharomyces cerevisiae, the gene-targeting frequency of subsequently 
introduced cloned genes increases by up to 2 orders of magnitude (Fig. 
18.18). The yeast RAD54 gene promotes integration of homologous introduced 
DNA into the plant genomic DNA. Transgenic plants that express 
RAD54 to a high level are readily selected on the basis of their resistance to 
high levels of γ-irradiation that are lethal to nontransformed plants. This 
resistance occurs as a consequence of the increased recombination repair 
efficiency conferred by RAD54. Initial experiments with plants that overexpress 
RAD54 utilized a GFP gene, whose product is easily detected, that 
was inserted in frame in the middle of the target gene. However, in experiments 
in which the gene of interest is not marked by GFP, it may be necessary 
to screen large numbers of transformants to find the target gene with 
the desired modification. Nevertheless, the use of RAD54-transgenic plants 
dramatically increases the probability of targeting specific changes in plant 
genes. The RAD54 gene may be deleted from the plant genome, prior to the 
agricultural use of the modified plant, by traditional genetic crosses with 
the wild-type plant, selecting for plants that contain the desired altered 
gene but not the RAD54 gene. 
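
The screening burden described above follows from simple arithmetic, sketched below in Python. The model is an assumption made for illustration: each screened seed or cell is taken to be transformed with a fixed probability, and a fixed fraction of the transformants (10^-3 to 10^-4, i.e., 3 or 4 orders of magnitude below random integration) is taken to carry a homologous integration. The function name is hypothetical.

```python
def screens_per_targeted_event(transformation_freq, homologous_fraction):
    """Expected number of seeds or cells screened per homologous-integration event.

    Assumes a screened unit is transformed with probability transformation_freq
    and that homologous_fraction of all transformants are correctly targeted.
    """
    return 1.0 / (transformation_freq * homologous_fraction)


# Bounds taken from the text: 1 to 5% transformation frequency and a
# homologous fraction of 10^-3 to 10^-4.
best_case = screens_per_targeted_event(0.05, 1e-3)
worst_case = screens_per_targeted_event(0.01, 1e-4)
print(f"without RAD54: {best_case:.0e} to {worst_case:.0e}")

# RAD54 overexpression: targeting frequency raised by up to 2 orders of magnitude.
with_rad54 = screens_per_targeted_event(0.01, 1e-4 * 100)
print(f"worst case with RAD54: {with_rad54:.0e}")
```

Under these assumptions the estimate runs from about 2 x 10^4 to 10^6, which falls within the range given in the text. A 100-fold improvement in targeting frequency, the upper limit reported for plants that overexpress RAD54, lowers the worst case to about 10^4.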

Consumers and regulators in a number of countries have expressed 
serious concerns about the commercialization of transgenic plants. However, 
modification of an existing plant gene, which is generally not considered to 
be genetic engineering, should allow some specifically altered plants to 
more rapidly reach the marketplace in those countries. The targeted stable 
modification of plant genomes is conceptually similar to conventional 
mutagenesis and selection procedures, and with this approach, no foreign 
DNA is introduced into the plant. 

Targeted Alterations in Plant RNA 

To modify the phenotypes of plants, researchers have directed their efforts 
toward the downregulation of the expression of certain plant genes. This 
inhibition of mRNA expression can be achieved by the expression of an 
additional copy of a gene (using a mechanism originally called cosuppression 
and currently thought to involve RNA interference), addition of an 
antisense version of a gene, or the use of ribozymes, small RNA molecules 
with the ability to act as sequence-specific endoribonucleases (for a discussion 


FIGURE 18.18 Constructs used in gene targeting. (A) The construct used to introduce 
the RAD54 gene includes a left border (LB) and right border (RB), a glufosinate (a 
herbicide) resistance gene with appropriate regulatory signals (BASTA) for selection, 
and the S. cerevisiae RAD54 gene under the control of the cauliflower mosaic 
virus 35S promoter (P35S) and transcription termination (TT) signals. (B) The 
construct containing the target gene and its regulatory signals. 

of RNA interference, antisense RNA, and ribozymes, see chapter 11). 
When they are present in a cell, ribozymes are able to recognize and cleave 
one specific mRNA of the many that are present in that cell, thereby 
decreasing the amount of protein that is encoded by that mRNA. In one 
experiment, a number of potential ribozyme cleavage sites were identified 
in the corn stearoyl-acyl carrier protein (ACP) Δ9 desaturase mRNA 
sequence, and then 20 different hammerhead ribozymes were synthesized 
and tested. Genes for the three most effective ribozymes were fused to the 
open reading frame of a functional plant structural gene to enhance 
ribozyme stability, and then each of the constructs was used to generate 
transgenic plants. A few of the transgenic plants demonstrated a reduction 
in stearoyl-ACP Δ9 desaturase mRNA and protein levels and an increase in 
the plant's stearate content (Fig. 18.19). The oil from plants that have a high 
stearic acid content is used in cooking and to make margarine. In order to 
inhibit translation, it is not sufficient for a ribozyme to merely base pair 
with mRNA. The biological activity of the ribozyme is dependent on its 
cleaving the target mRNA. 

Facilitating Protein Purification 

Transgenic plants have some advantages over bacteria as expression systems 
for foreign proteins. For example, transgenic bacteria are typically 
grown in expensive bioreactors under precisely defined conditions and 
require highly skilled personnel to oversee the entire growth process. On 
the other hand, it is generally thought that plants may be grown in the field 
by somewhat less highly skilled individuals at a much lower cost and in 
more or less unlimited quantities. Thus, one of the main advantages of 
expressing a foreign protein in a transgenic plant is that it is relatively inexpensive 
to grow plants on a large scale. 

With both plants and bacteria, following growth, the organisms must 
be harvested and processed before the recombinant protein can be purified. 
Protocols for the purification of proteins from transgenic bacteria are well 
established, and many proteins produced by bacteria are in the marketplace. 
However, transgenic plants produce a much lower level of foreign 
proteins; therefore, for commercial production of a range of protein products, 
the purification of target proteins produced by transgenic plants 
requires special strategies and approaches. 

Oleosins. A novel way to facilitate recombinant plant protein purification 
is to fuse the foreign protein to plant oleosins. Oleosins, or oil body proteins, 
are found in the seeds of a wide range of plants. These proteins are 
quite hydrophobic and are mostly embedded within small oil droplets (0.6 
to 2 μm in diameter) called plant oil bodies, thus stabilizing the oil bodies 

FIGURE 18.19 Modification of fatty acid biosynthesis in maize. The conversion of 
stearic to oleic acid is blocked by the action of the added ribozyme on the mRNA 
encoding stearoyl-ACP Δ9 desaturase, and stearic acid accumulates. The numbers 
to the left of the colons indicate the numbers of carbon atoms in the fatty acid, and 
the numbers to the right of the colons indicate the numbers of C=C bonds, which 
are formed by the activity of desaturases. CoA, coenzyme A. 
as discrete organelles. However, the N- and C-terminal regions of oleosins 
are more hydrophilic than the rest of the protein and are exposed to the 
aqueous environment. It is therefore possible to engineer fusions between 
oleosins and water-soluble proteins at the DNA level (Fig. 18.20), with the 
expectation that the fusion protein will be targeted to plant oil bodies, 
making it relatively easy to purify. In this case, the water-soluble target 
protein will not be embedded in the oil body but rather will be exposed to 
the aqueous environment. Since it is relatively easy to purify oil bodies 
from plant seeds, the purification of the recombinant protein is simplified. 
A cleavable linker is included between the oleosin and the target protein so 
that the recombinant protein can be recovered by cleavage of the fusion 
protein. Expression in seeds is particularly attractive, as proteins accumulate 
stably and seeds can be stored without deterioration prior to being 
processed. This system has the potential to significantly lower the costs of 
purifying target proteins produced in plants. 

Rhizosecretion. Harvesting the variety of protein products that can be 
produced in transgenic plants can be difficult, since they are generally 
localized within plant cells that must be disrupted before their purification. 
Moreover, since the cost of purifying a recombinant protein can be as much 
as 90% of the total cost of producing that protein, if plants are going to be 
used as bioreactors for protein production, it is essential that purification 
costs be kept to a minimum. One way around this problem is to engineer 
plants to secrete foreign proteins through the roots in a process that has 
been called "rhizosecretion." If a plant engineered for rhizosecretion is 
grown hydroponically, the protein will be secreted directly into the culture 
medium (Fig. 18.21). 

Normally, roots secrete large amounts of small molecules, such as 
sugars and amino acids; however, they secrete only low levels of relatively 
few proteins. These small organic molecules, including mainly amino acids 
and sugars, are first secreted to the root intercellular space (apoplast) before 
they are exuded by the roots. In one series of experiments, three different 
proteins (xylanase from the thermophilic bacterium Clostridium thermocellum, 
GFP from the jellyfish Aequorea victoria, and human placental secreted 
alkaline phosphatase) were tested to determine whether they could be 
engineered for secretion through the roots. The three proteins were directed 
to the root apoplast using three different secretion signals (Fig. 18.22). Each 
protein was efficiently exuded by the roots of transgenic tobacco plants, as 
long as the genetic construct contained a DNA fragment encoding a signal 
peptide, even one that was not of plant origin, placed upstream (at the 5' 
end) of the gene whose protein was targeted for secretion. Both the 35S promoter, 
which is expressed in all plant cell types, and the mas2' promoter, 
which is preferentially expressed in roots, directed the synthesis of a significant 
amount of the target protein in root tissue. With the 35S promoter, 
the foreign protein could also be recovered from the guttation fluid (i.e., leaf 
exudate). However, given the ease of collecting root as opposed to leaf exudate, 
rhizosecretion appears to have the most promise at the present time. 

Glycosylation. A large number of mammalian proteins, including many 
potentially therapeutic molecules, are glycosylated (i.e., they contain specific 
sugars attached to the hydroxyl group of either serine or threonine or 
the amide group of asparagine). While the addition of glycans (polysaccharides) 
containing high levels of mannose at specific asparagine residues 




FIGURE 18.20 Schematic representation 
of a fusion protein including oleosin 
and a water-soluble target protein 
embedded in a plant seed oil body. The 
N- and C-terminal ends of the oleosin 
and the target protein are hydrophilic 
and are therefore found in the aqueous 
environment. 


FIGURE 18.21 Schematic representation 
of a plant in hydroponic culture 
secreting proteins and small molecules 
(red arrows) into the medium. The 
arrows at the inlet and outlet ports 
indicate the direction of flow of added 
nutrient solutions. The proteins secreted 
by the roots are concentrated and harvested 
from the hydroponic medium 
and then purified. 
FIGURE 18.22 DNA constructs of proteins 
engineered for secretion into the 
apoplast in tobacco. (A) A truncated 
xylanase (xyl-t) gene from C. thermocellum 
is transcriptionally controlled by 
the cauliflower mosaic virus 35S promoter 
(P35S) and targeted to the apoplast 
by a tobacco proteinase inhibitor 
signal peptide (SPpn). (B) A GFP (gfp) 
gene from A. victoria controlled by the 
strong modified Agrobacterium mannopine 
synthase promoter (pmas2') that is 
preferentially active in plant roots and 
targeted to the apoplast by a Nicotiana 
plumbaginifolia (another type of tobacco) 
calreticulin proteinase inhibitor signal 
peptide (SPcol). (C) A truncated human 
alkaline phosphatase (alk) gene controlled 
by the strong modified 
Agrobacterium mannopine synthase 
promoter (pmas2') that is preferentially 
active in plant roots and targeted to the 
apoplast by the human alkaline phosphatase 
signal peptide (SPalk). 


on proteins is initially identical in mammalian and plant cells, trimming of 
the sugar residues generates complex N-glycans with very different structures 
and properties in the two different types of organisms. Although differences 
in glycosylation may not directly alter the activity of a protein, 
other properties, such as folding, stability, solubility, susceptibility to proteases, 
blood clearance rate, and antigenicity, can be affected profoundly. 

To avoid some of the problems that result from the incorrect glycosylation 
of mammalian proteins that are produced in plant cells, it is possible 
to modify the plant so that it does not add "problematic" carbohydrate 
residues. For example, when the aquatic plant Lemna minor was engineered 
to produce light and heavy chains that are part of a human monoclonal 
antibody, the plant was simultaneously transformed with an RNA interference 
construct (see chapter 11) that specifically inhibited the expression of 
the plant enzymes α-1,3-fucosyltransferase and β-1,2-xylosyltransferase. 
The antibodies that were produced by these transgenic plants contained a 
single major N-glycan species without any plant-specific N-glycan residues 
(Fig. 18.23). While these truncated glycan side chains are not identical to 
the glycans that are produced when a human monoclonal antibody is synthesized 
in Chinese hamster ovary (CHO) cells in culture, they share the 
same core molecules. When human monoclonal antibodies produced in 
genetically modified plants were tested, they displayed better 
antibody-dependent cell-mediated cytotoxicity and effector cell receptor-binding 
activities than antibodies that were produced in CHO cells. Thus, with this 
system, it may be possible to produce a wide range of mammalian proteins 
in which the glycan side chains no longer limit the use of the final protein. 
In addition, it should be much less expensive to produce mammalian proteins 
in transgenic plants than in mammalian cell culture. 


Production of Marker-Free Transgenic Plants 

Usually, at the same time that a foreign gene is introduced into plants, a 
selectable marker gene is also introduced. Although none of these genes or 
their products have been shown to have an adverse effect on human, 


FIGURE 18.23 Structures of the N-glycan side chains found on proteins synthesized 
by CHO cells, the native form of the aquatic plant L. minor, or a genetically modified 
form of L. minor. 
FIGURE 18.24 Schematic representation of cotransformation of a plant cell with two 
separate DNAs. One DNA carries a selectable marker, while the other carries the 
target gene. On average, more than half of the transformed plants have both of 
these genes, but at separate sites on the chromosome. Several rounds of self-mating 
allow the segregation of the marker and target genes into different plants. 


animal, or environmental safety, their inclusion in transgenic plants has 
raised some concerns. For example, it is possible that the products of some 
marker genes might be either toxic or allergenic. Also, the antibiotic resistance 
genes that are used as selectable markers might be transferred to 
pathogenic soil microorganisms. Moreover, the presence of a selectable 
marker makes it technically more difficult to transform a transgenic plant 
with additional foreign genes, since the same selectable marker cannot be 
used more than once. To allay these concerns, strategies for the production 
of transgenic plants without any marker genes have been developed. 

Removing Marker Genes from Nuclear DNA 

One experimental approach that has been used to produce markerless 
transgenic plants includes cotransformation of plants with two separate 
DNAs, one carrying the marker gene and the other carrying the target foreign 
gene. With cotransformation, approximately 30 to 80% of the transformed 
plants contain both genes. However, since the two genes are 
integrated at different sites on the chromosomal DNA, traditional breeding 
techniques can be used to rid the transgenic plant of the selectable marker 
(Fig. 18.24). Thus, the two genes are separated by chromosome segregation 
during a few rounds of matings. 
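
The expected yield of marker-free plants from one round of self-mating can be estimated by enumerating gametes, as in the Python sketch below. The sketch rests on assumptions that are not stated in the text: the primary transformant is taken to be hemizygous for a single marker insertion and a single, unlinked target gene insertion, and both are taken to segregate in a normal Mendelian fashion. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def marker_free_fraction_after_selfing():
    """Fraction of selfed progeny that carry the target gene but no marker gene.

    Assumes a parent hemizygous for one marker insertion and one unlinked
    target gene insertion, with normal Mendelian segregation.
    """
    # Each gamete either carries or lacks each insertion: (has_marker, has_target).
    gametes = list(product([True, False], repeat=2))
    wanted = 0
    total = 0
    for egg, pollen in product(gametes, repeat=2):
        has_marker = egg[0] or pollen[0]
        has_target = egg[1] or pollen[1]
        total += 1
        if has_target and not has_marker:
            wanted += 1
    return Fraction(wanted, total)


print(marker_free_fraction_after_selfing())
```

Under these assumptions, 3 of every 16 progeny retain the target gene and lack the marker. Linked insertions or multiple copies of either DNA would lower this fraction, which is one reason that more than one round of mating may be needed.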

Alternatively, a selectable marker gene is cloned between plant transposable 
elements (Ds elements) and inserted into the T-DNA along with 
the target gene and a transposase gene that excises the DNA between the 
Ds elements and integrates it at another chromosomal site (Fig. 18.25). Any 
sequence that is between two Ds elements can be mobilized to a new location 
in the genome, and thereby excised from its original location, provided 
that the appropriate transposase enzyme is available. During insertion of 
the T-DNA into the host plant DNA, about 90% of the time the selectable 
marker that is between the two Ds elements will be moved to another site 
on the chromosomal DNA. About half of the time the new location for the 
selectable marker will be far away from the original location. Thus, a selectable 
marker gene can be used to identify transformed plant cells, and subsequently, 
it can be removed by breeding. 
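
The two frequencies quoted above can be combined to estimate how many transformants are likely to be useful for breeding. The short Python sketch below assumes that "about half of the time" refers to the events in which the marker has been transposed and that the two steps are independent; both the interpretation and the function names are assumptions made for illustration.

```python
def usable_fraction(p_transposed=0.9, p_far=0.5):
    """Fraction of T-DNA insertions in which the marker ends up far from the target gene."""
    return p_transposed * p_far


def transformants_needed(n_usable, p_usable):
    """Expected number of transformants required to obtain n_usable suitable lines."""
    return n_usable / p_usable


p = usable_fraction()
print(f"usable fraction: {p:.2f}")
print(f"transformants for 10 usable lines: {transformants_needed(10, p):.0f}")
```

On this reading, a little under half of the transformants would carry a marker gene that is sufficiently distant from the target gene to be separated from it by crossing.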

Any procedure that uses sexual crossing to segregate the selectable 
marker from the gene of interest cannot be applied to woody plants 
(because of their long generation times), vegetatively propagated plants, or 
sterile plants. In addition, if several related genes are introduced, and these 
genes are unlinked, they may segregate independently and consequently 
be lost in subsequent generations. One simple way around this problem is 
to utilize DNA sequences that flank the selectable marker and mark it for 
eventual removal from the genome (Fig. 18.26). In this way, the selectable 
marker and flanking sequences are only transiently present as a component 
of the transgenic plant. For example, if the selectable marker gene is fused 
to a recombinase gene and flanked by sequences that are recognized by the 
recombinase, both from the yeast Zygosaccharomyces rouxii, the selectable 
gene will be initially expressed. However, the recombinase is also expressed 
and mediates excision of the region between its recognition sites, thus 
removing both the selectable marker and recombinase genes after growth 


FIGURE 18.25 Schematic representation of a T-DNA-based selectable marker gene 
excision system. Following integration of the T-DNA into the plant chromosomal 
DNA, the transposase can excise the selectable marker gene and insert it into a different 
chromosomal location. LB, the T-DNA left border; RB, the T-DNA right 
border. The promoter and transcription termination sequences of the genes are not 
shown. 

FIGURE 18.26 Schematic representation of a DNA construct for generating marker- 
free transgenic plants. RB and LB are the right and left borders of the T-DNA, 
respectively. After transformation, selection of transformed cells, and growth in 
culture, recombinase cleaves the DNA at the recombination sites, and the selectable 
marker and recombinase genes are lost. Transgenic plants are formed from the cells 
without the selectable marker gene. 


in culture for several months (Fig. 18.26). In this case, transformed plant 
cells are selected immediately following transformation, characterized to 
ascertain whether they contain and express the target gene, and then 
grown in culture until the selectable marker has been excised. In addition, 
several similar excision systems that result in the removal of the selectable 
marker have been developed. Importantly, regardless of the details, it is no 
longer necessary to produce transgenic plants that retain antibiotic resistance 
genes or other undesirable selectable markers. 

Most transgenic plants are selected using any one of a small number of 
antibiotic or herbicide resistance genes. An alternative to utilizing these 
genes includes expressing the D-amino acid oxidase (DAO) gene from the 
yeast Rhodotorula gracilis in transgenic plants. This enzyme catalyzes the 
deamination of several D-amino acids that might otherwise become inhibitory 
to plant growth. Thus, for example, nontransgenic plants (which lack 
this enzyme) are inhibited in the presence of D-alanine despite the fact that 
they can grow normally in the presence of D-isoleucine (Fig. 18.27). On the 
other hand, plants that have been transformed to express the DAO gene 
grow normally in the presence of D-alanine but are inhibited by D-isoleucine. 
An innovative way to utilize this selection would involve cotransforming 
plants with two separate DNAs, one with the DAO gene and the 
other with the target gene (with the expectation that the cotransformation 
frequency will be on the order of 30% to 80%). Alternatively, the DAO gene 
flanked by Ds elements (Fig. 18.25) and a target gene may be inserted into 
the same T-DNA. In either case, the DAO gene and the target gene will be 
located far away from one another so that sexual crossing (breeding) may 
be used to segregate the target gene from the DAO gene. The initial transformants 
with both genes are selected following growth on D-alanine. 
Following sexual crossing, the segregated transformants that have lost the 
DAO gene are selected following growth on D-isoleucine. The selected 
transformants are then assayed for the presence of the target gene. A significant 
fraction of the plants that lose the DAO gene nevertheless retain the 
target gene. Despite the elegance of this negative/positive selection system, 
its efficacy in producing marker-free plants remains to be demonstrated. 
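
The logic of the two-step DAO selection can be written out explicitly, as in the following Python sketch. It encodes only the growth responses described in the text; the plant records, their field names, and the function names are hypothetical and are included solely to make the example runnable.

```python
def grows(has_dao, d_amino_acid):
    """Growth response described in the text for a plant on a D-amino acid medium."""
    if d_amino_acid == "D-alanine":
        return has_dao        # inhibitory unless the DAO gene is expressed
    if d_amino_acid == "D-isoleucine":
        return not has_dao    # inhibitory only when the DAO gene is expressed
    raise ValueError(f"no rule given for {d_amino_acid!r}")


def marker_free_candidates(progeny):
    """Progeny that survive on D-isoleucine (DAO lost) and test positive for the target gene."""
    return [p["name"] for p in progeny
            if grows(p["dao"], "D-isoleucine") and p["target"]]


# Step 1: primary transformants are selected on D-alanine (DAO gene present).
print(grows(True, "D-alanine"), grows(False, "D-alanine"))

# Step 2: hypothetical progeny obtained after sexual crossing.
progeny = [
    {"name": "line1", "dao": True, "target": True},
    {"name": "line2", "dao": False, "target": True},
    {"name": "line3", "dao": False, "target": False},
]
print(marker_free_candidates(progeny))
```

Only the progeny that have lost the DAO gene survive the second step, and the final assay for the target gene is still required because survival on D-isoleucine says nothing about whether the target gene was retained.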

Removing Marker Genes from Chloroplast DNA 

The promise of high levels of foreign-gene expression, as well as the possibility 
of expressing several related genes under the control of the same 
promoter, has resulted in an increasing number of foreign genes being 
introduced into plant chloroplast DNA. One way to remove marker genes 
FIGURE 18.27 A possible positive-negative selection scheme for marker-free transformed 
plants. (A) Wild-type plants that lack D-amino acid oxidase (DAO) are 
inhibited by growth on D-alanine but not D-isoleucine. (B) DAO-transformed plants 
are inhibited by growth on D-isoleucine but not D-alanine. (C) Plants are cotransformed 
with one DNA carrying the DAO gene as a selectable marker and a separate 
DNA fragment with the target gene and then grown on medium containing D-alanine. 
Only transformants carrying the DAO gene will proliferate under these conditions, 
and most (>50%) of those transformants will also carry the target gene 
integrated at a separate site on the plant chromosome. Following breeding, plants 
that proliferate on medium containing D-isoleucine (because they no longer contain 
the DAO gene) are tested for the presence of the target gene. Adapted from Scheid, 
Nat. Biotechnol. 22:398-399, 2004. 





from chloroplast DNA is to introduce the foreign genes as part of a genetic 
construct that includes a selectable bacterial gene, e.g., aadA, which confers 
resistance to the antibiotics spectinomycin and streptomycin. This marker 
gene is flanked by directly repeating DNA sequences (in this case 174 bp); 
following cell growth in the absence of selective pressure, the selectable 
gene will be excised by homologous recombination between the 174-bp 
sequences (Fig. 18.28). 
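
The outcome of recombination between two directly repeated sequences can be modeled as a simple string operation. In the Python sketch below, the short sequences are placeholders for the 174-bp repeat and for the genes, and the function name is hypothetical; the model assumes a single recombination event that removes the intervening DNA together with one copy of the repeat.

```python
def excise_between_direct_repeats(seq, repeat):
    """Model recombination between the first two direct copies of `repeat` in `seq`.

    The segment between the two copies and one copy of the repeat are removed,
    leaving a single repeat behind. If fewer than two copies are present,
    the sequence is returned unchanged.
    """
    first = seq.find(repeat)
    if first == -1:
        return seq
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        return seq
    return seq[:first] + seq[second:]


# Placeholder sequences (not real DNA): a target gene followed by a marker
# gene that is flanked by two direct repeats.
REPEAT = "ACGTAC"
construct = "gggg" + "TARGET" + REPEAT + "aadA" + REPEAT + "cccc"
result = excise_between_direct_repeats(construct, REPEAT)
print(result)
```

In this toy example the marker is lost while the target gene and a single copy of the repeat remain, which mirrors the arrangement expected after excision in the absence of selective pressure.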

An alternative approach is to develop selectable markers that do not 
encode antibiotic resistance or any other undesirable trait. One possible 
selectable marker of this type is the spinach gene for the enzyme betaine 
aldehyde dehydrogenase. This enzyme, which is present in the chloroplasts 

FIGURE 18.28 Schematic representation of a selectable marker gene excision system 
for transformed chloroplasts. Following introduction into chloroplasts, the chloroplast 
(Cp) DNA sequences on the plasmid direct the integration of the foreign gene 
into the chloroplast genome by homologous recombination. Transformants are 
selected by their resistance to spectinomycin. Then, transformants are grown 
without antibiotics, and the selectable marker gene is lost as a result of homologous 
recombination between the 174-bp repeat elements. The promoters, transcription 
termination regions, and ribosome-binding sites of both genes are not shown. 


of a limited number of plants, converts the toxic betaine aldehyde to 
the nontoxic glycine betaine. In addition, glycine betaine acts as an osmoprotectant, 
conferring some measure of salt or drought resistance. When 
resistance to betaine aldehyde was used to select transgenic plants, the 
transformation efficiency was approximately 25-fold higher than when 
transformants were selected with spectinomycin. The use of this approach 
should facilitate the chloroplast transformation of many important crops, 
including cereals that are naturally resistant to spectinomycin. 


SUMMARY 


Plants are genetically engineered by (1) introducing a gene 
into plant cells that are growing in culture, (2) selecting 
transformed cells, and then (3) regenerating a fertile plant. 
Strains of the soil bacterium A. tumefaciens can genetically 
engineer plants naturally. In this system, after responding to 
chemical signals from a surface wound, A. tumefaciens makes 
contact with an exposed plant cell membrane. A series of steps 
then occurs that results in the transfer of a segment (T-DNA) 
of a plasmid (Ti plasmid) from the bacterium into the nucleus 
of the plant cell. The T-DNA region becomes integrated into 
the plant genome, and subsequently, the genes on the T-DNA 
region are expressed. The T-DNA region contains genes that 
encode enzymes for the production of phytohormones. These 
compounds cause the plant cells to enlarge and proliferate. 
Also, the plant cell becomes a factory for the production of an 
opine that is encoded by a T-DNA gene that can be catabolized 
only by A. tumefaciens with a specific Ti plasmid. Thus, A. 
tumefaciens has evolved a mechanism that converts a plant cell 
into a production center for a carbon and nitrogen source 
(opine) for its exclusive use. 

The A. tumefaciens-Ti plasmid system has been modified 
for use as a delivery mechanism for cloned genes to some 
plant cells. In these vector systems, the phytohormone and 
opine metabolism genes have been removed from the T-DNA 
region, and the modified T-DNA sequence has been cloned 
into a plasmid that can exist stably in E. coli. A cloned gene 
that is inserted into this T-DNA region is part of the DNA that 
is transferred into the nucleus of a recipient plant cell. To 
achieve this transfer, A. tumefaciens is used as a delivery 
system. In one system, the shuttle vector with the T-DNA- 
cloned gene segment is introduced into an A. tumefaciens 
strain that carries a compatible plasmid with genes that are 
essential for transferring a T-DNA region into a plant cell (vir 
genes). In addition to this binary vector system, a cointegrate 
system has been designed so that, after the introduction of the 
shuttle vector carrying the target gene into A. tumefaciens, it 
recombines with the vir gene-containing, disarmed Ti plasmid 
to give a single plasmid that has both vir gene functions and 
the T-DNA-cloned gene segment. 

The A. tumefaciens T-DNA region has been used to produce 
a large number of transgenic plants. Unfortunately, this 
system is not effective with all plants. However, microprojectile 
bombardment (biolistics) has been an effective procedure 
for delivering DNA to a wider range of plant cells. Moreover, 
foreign DNA up to 150 kb in size that is cloned in YACs can be 
transferred to plant cells using a biolistic procedure. This 
transferred DNA can be stably integrated into the genome of 
the plant cells. 

Different plant promoters that are active only in specific 
plant tissues, or only at certain times during the life of the 
plant, are identified by the acquisition of expression of a promoterless 
reporter gene after it has integrated into plant chromosomal 
DNA. Methods for the insertion of the foreign gene 
directly into either the chloroplast or mitochondrial DNA and 
protocols to make targeted changes to existing genes within 
the genomic DNA of plant cells have been developed. In addition, 
techniques have been devised to decrease the amount of 
a specific mRNA in a plant cell and thereby downregulate the 
expression of that gene. To facilitate the purification of foreign 
proteins that are produced in plants, it is possible to fuse a 


target gene to an oleosin gene so that the fusion protein that is 
produced is localized in seed oil bodies. A technique called 
rhizosecretion can be employed to secrete the foreign protein 
along with other root exudates. To increase the usefulness of 
plants as production systems for mammalian proteins, it is 
possible to modify the glycosylation patterns of foreign proteins 
synthesized in plants to avoid problematic carbohydrate 
moieties. Finally, experimental protocols that remove marker 
genes from transgenic plants have been developed. 
REVIEW QUESTIONS 

1. Why is the Ti plasmid from A. tumefaciens well suited for 
developing a vector to transfer foreign genes into plant chromosomal 
DNA? 

2. How do (1) binary and (2) cointegrate Ti plasmid-based 
vector systems for plant transformation differ from one 
another? 

3. What are reporter genes, and how are they used when 
plant cells are transformed? 

4. How are plants transformed by microprojectile bombardment? 

5. Describe how you would isolate a root-specific plant promoter. 

6. How is foreign DNA targeted for integration into chloroplast 
DNA? 

7. How would you produce a transgenic plant that does not 
contain a marker gene? 

8. How would you ensure that a foreign gene that has been 
inserted into the chloroplast is expressed at a high level? 

9. How can RNA-DNA chimeric molecules be used to introduce 
targeted alterations to plant genomic DNA? 

10. How would you downregulate the expression of a plant 
gene? 

11. What is the advantage of introducing foreign genes into 
chloroplast rather than nuclear DNA? 

12. What is rhizosecretion? Why is it useful? How can it be 
engineered? 

13. How can chromosomal marker genes be removed without 
using sexual crossing to segregate the selectable marker from 
the gene of interest? 

14. Describe a strategy that could be used to remove antibiotic-resistant 
marker genes from chloroplast DNA. 

15. Suggest a strategy that would facilitate the large-scale 
purification of soluble proteins, such as antibody fragments, 
in plants. 

16. How do enhancer sequences facilitate plant gene expression? 

17. How can oleosins be used to facilitate the purification of a 
target protein synthesized in a transgenic plant? 

18. How would you modify the glycosylation pattern of a 
mammalian protein produced in plants? 

19. How would you use the yeast DAO gene to select transgenic 
plants that contain only the introduced target gene and 
no selectable marker gene? 



B. thuringiensis Protoxin 

Other Strategies for Protecting Plants 

against Insects 

Preventing the Development of 
B. thuringiensis -Resistant Insects 

Virus Resistance 

Viral Coat Protein-Mediated Protection 
Protection by Expression of Other 
Genes 

Herbicide Resistance 

Fungus and Bacterium Resistance 

Oxidative Stress 

Salt and Drought Stress 

Fruit Ripening and Flower Wilting 

SUMMARY 

REFERENCES 

REVIEW QUESTIONS 


Engineering Plants To 
Overcome Biotic and 
Abiotic Stress 


The principal objective of plant biotechnology is to create new varieties 
of cultivated plants (cultivars). The majority of the initial studies 
of transgenic plants have focused on developing strains that give 
better yields. Genes that confer resistance to insects, viruses, herbicides, 
environmental stress, and senescence have been incorporated into various 
plants. A considerable amount of this work has been commercialized and 
has been the subject of much public scrutiny and discussion. Some of this 
work is discussed below. 


Insect Resistance 

The genetic engineering of crop plants to produce functional insecticides 
makes it possible to develop crops that are intrinsically resistant to insect 
predators and do not need to be sprayed (often six to eight times during a 
growing season) with costly and potentially hazardous chemical pesticides. 
It has been estimated that in 2007 the amount spent on chemical insecticides 
worldwide was approximately $15 billion to $20 billion. The cost of 
maintaining such genetically engineered insect-resistant crops is lower 
than that for nonresistant crops. Moreover, biological insecticides are usually 
highly specific for a limited number of insect species, and they are 
generally considered to be nonhazardous to humans and other higher animals. 
In addition, by reducing the damage to plants from insect predation, 
a corresponding decrease in the damage to plants from a number of fungal 
diseases should result, since many pathogenic fungi often invade a plant 
either together with or as a consequence of insect infection. 

Several different strategies have been used to confer resistance to insect 
predators. One approach involves a gene for an insecticidal protoxin produced 
by one of several subspecies of the bacterium Bacillus thuringiensis 
(see chapter 16). Other common strategies use genes for plant proteins, 
such as α-amylase inhibitors, protease inhibitors, and lectins, that have 
been shown to be effective against a wide variety of insects. After an insect 
ingests one of these inhibitors, it is not able to digest food (i.e., plants) 
because the inhibitor interferes with the hydrolysis of starch or plant proteins. 
Thus, the insect will feed less and eventually die. 

Increasing Expression of the B. thuringiensis Protoxin 

B. thuringiensis protoxin does not persist in the environment, nor is it hazardous 
to mammals. Thus, it is a safe means of protecting plants. It is both 
simpler and less costly to express the genes for B. thuringiensis toxins in 
plants than to spray B. thuringiensis preparations onto the surface of the 
plant. This mode of insecticidal-toxin delivery limits the environmental 
distribution of the toxin and avoids problems associated with spraying B. 
thuringiensis preparations, such as limited environmental stability and the 
timing of the toxin application. 

The scientific challenge in utilizing the B. thuringiensis protoxin is to 
create a transgenic plant that expresses and synthesizes a functional form 
of this prokaryotic insecticide at sufficient levels to prevent damage by 
insect predation. In initial experiments, the B. thuringiensis subsp. kurstaki 
insecticidal-protein genes, cry1Aa, cry1Ab, and cry1Ac, were not particularly 
well expressed in plants (Table 19.1). This is problematic, because high 
levels of expression of these insect control proteins are needed in order to 
produce commercially viable insect-resistant plants. To raise the level of the 
expressed protein, scientists truncated the gene so that only the N-terminal 
portion of the insecticidal protoxin, the part that contains the toxin (see 
chapter 16), was produced, and they placed the truncated gene under the 
control of a strong plant promoter. Under these conditions, there was a 
significant increase in the level of insecticidal toxin produced, affording 
transgenic plants some protection against damage from insect predation. 

The minimum sequence that encoded toxin activity had to be determined. 
To this end, the amino acid sequences of protoxins from various 
strains of B. thuringiensis were compared to determine whether there is a 
common insecticidal (toxin) domain. This analysis showed that the 
N-terminal portion of the protoxin molecule is highly conserved (~98%) 
and the C-terminal region is more variable (~45% conserved). Further work 
showed that all of the insecticidal-toxin activity resides within the first 646 


TABLE 19.1 Expression of some B. thuringiensis insecticidal toxin genes in transgenic plants 

Plant(s)          Gene                     % Expression     Insecticidal 
Tobacco           cry1Ab, full             0.0001-0.0005    No 
Tobacco           cry1Ab, truncated        0.003-0.012      Yes 
Tobacco           cry1Aa, full             Not detected     No 
Tobacco           cry1Aa, truncated        0.00125          Yes 
Tobacco           cry1Ac, truncated        <0.014           Yes 
Tomato            cry1Ab, truncated        0.0001           Yes 
Cotton            cry1Ab, truncated, WT    <0.002           No 
Cotton            cry1Ab, truncated, PM    0.05-0.1         Yes 
Tomato, tobacco   cry1Ab, truncated, WT    0.002            Yes 
Tomato, tobacco   cry1Ab, truncated, PM    0.002-0.2        Yes 
Tomato, tobacco   cry1Ab, truncated, FM    0.3              Yes 

Adapted from Ely, p. 105-124, in Entwistle et al. (ed.), Bacillus thuringiensis, an Environmental 
Biopesticide: Theory and Practice (John Wiley & Sons, Chichester, United Kingdom, 1993). 

Terms and abbreviations: full, the complete protoxin gene; truncated, a shortened version of the 
protoxin gene; WT, wild-type codons; PM, partially modified codons; FM, fully modified codons. 
FIGURE 19.1 Cointegrate cloning vector carrying a B. thuringiensis (Bt) insecticidal-toxin 
gene. The toxin gene is under the control of the strong, constitutive 35S promoter 
(P35S) from cauliflower mosaic virus and the nopaline synthase transcription 
terminator-polyadenylation site (tNOS). The vector has an E. coli origin of DNA 
replication (ori) and an Spc^r gene, which allow the vector to be maintained and 
selected in E. coli cells; a T-DNA right border; a plant selectable marker gene; and a 
region of DNA that is homologous to DNA in the disarmed Ti plasmid, for integrating 
the two plasmids. The neomycin phosphotransferase gene (NPT), which 
acts as a plant reporter gene, is under the transcriptional control of nopaline synthase 
gene sequences (pNOS and tNOS) and is used to select for kanamycin-resistant 
transformed plant cells. 

amino acids from the N terminus of the 1,156-amino-acid protoxin. When 
the segment of the protoxin gene that encodes the highly conserved amino 
acid sequence was cloned and expressed in bacteria, the shortened protein 
was as active as the native (protoxin) form in protecting plants against 
lepidopteran insects in laboratory trials. 
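
The size of this truncation can be checked with simple arithmetic. The Python sketch below is illustrative only: the residue counts (646 and 1,156) come from the text, whereas the function name `truncate_cds`, the choice of TAA as the stop codon, and the toy coding sequence are hypothetical and are not taken from the published work.

```python
PROTOXIN_AA = 1156  # length of the full protoxin (from the text)
TOXIN_AA = 646      # N-terminal residues that carry all toxin activity (from the text)


def truncate_cds(cds: str, n_codons: int, stop: str = "TAA") -> str:
    """Keep the first n_codons codons of a coding sequence and append a stop codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if n_codons * 3 > len(cds):
        raise ValueError("cannot keep more codons than the sequence contains")
    return cds[: n_codons * 3] + stop


# Fraction of the protoxin retained, and the coding length (without stop) it needs.
toxin_fraction = TOXIN_AA / PROTOXIN_AA
toxin_coding_bp = TOXIN_AA * 3

# Toy example (hypothetical 6-codon sequence): keep the first 3 codons.
toy_cds = "ATGGCTAAACCCGGGTTT"
toy_truncated = truncate_cds(toy_cds, 3)

print(f"toxin domain = {toxin_fraction:.1%} of the protoxin, {toxin_coding_bp} bp of coding DNA")
print(toy_truncated)
```

Under these assumptions, the toxin-encoding segment is a little more than half of the protoxin gene, which is consistent with the description of the truncated gene as the "short form."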

Transgenic tomato plants with a truncated B. thuringiensis protoxin 
gene were produced to test whether the shortened protoxin would be able 
to protect the plants from damage by various insect pests. The shortened 
version of the protoxin gene was put under the transcriptional control of 
the strong constitutive 35S promoter from cauliflower mosaic virus and the 
nopaline synthase transcription termination-polyadenylation site and was 
then cloned into the T-DNA (transferred DNA) region of a cointegrate-type 
Ti plasmid vector (Fig. 19.1). The vector contained a spectinomycin resistance 
(Spc^r) gene that allowed it to be selected in either Escherichia coli or 
Agrobacterium tumefaciens, an E. coli origin of DNA replication, and a neomycin 
phosphotransferase gene that was under the control of the nopaline 
synthase promoter and transcription termination-polyadenylation sites 
and enabled the selection of transformed plant cells in the presence of 
kanamycin. In addition, the cointegrate cloning vector had the right border 
of the T-DNA from a nopaline Ti plasmid and a segment of the octopine Ti 
plasmid that provides a region of homology for cointegrate formation by 
homologous recombination with a disarmed Ti plasmid. The plasmid was 
constructed and manipulated in E. coli before it was transferred by conjugation 
to a strain of A. tumefaciens that contained a disarmed Ti plasmid. After 
recombination in A. tumefaciens, the short form of the protoxin gene was 
transferred to the chromosomal DNA of tomato plants. 

In both greenhouse and field trials, transgenic tomato plants that 
expressed the short form of the protoxin were protected to some degree 
against damage caused by tobacco hornworms (Manduca sexta), tomato 
fruitworms (Heliothis zea), and tomato pinworms (Keiferia lycopersicella). 
The extent of the protection was not the same for each of the insects, nor 
was it complete. The transgenic plants were protected to some extent from 
damage caused by tobacco hornworms and tomato fruitworms and to a 
lesser degree from damage by tomato pinworms. A combination of a low 
dose of chemical insecticide and production of the protoxin by the plants 
increased the level of protection afforded by the protoxin. 

In an effort to dramatically increase the level of expression, an isolated 
insecticidal-toxin gene was modified by site-directed mutagenesis to 
change any DNA sequences that could inhibit efficient transcription or 
translation in a plant host (Table 19.1). This "partially" modified gene had 
a nucleotide sequence that was 96.5% unchanged from that of the wild- 
type gene and encoded the identical insecticidal-toxin protein. Transgenic 
plants that expressed this partially modified sequence produced a 10-fold-higher 
level of insecticidal-toxin protein than did plants that were transformed 
with the wild-type gene. Subsequently, a "fully" modified version 
of the insecticidal-toxin gene was designed and chemically synthesized. 
This fully modified gene contained codons more commonly used by plants, 
as opposed to those favored by gram-positive bacteria, such as B. thuringiensis. 
This gene was also modified to eliminate any potential messenger 
RNA (mRNA) secondary structure or chance plant polyadenylation 
sequences that might decrease gene expression. After modification, it had 
a G+C content of 49% (the wild-type gene is 37% G+C) and a nucleotide 
sequence that was only 78.9% identical to that of the wild-type gene. 
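
The idea behind codon modification, changing the nucleotide sequence while leaving the encoded protein untouched, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by the researchers: the rule "choose the G+C-richest synonymous codon" is a hypothetical stand-in for a real plant codon usage table, and the 15-nucleotide "gene" is a toy sequence. The three quantities the text reports (G+C content, nucleotide identity to the wild type, and identity of the encoded protein) are each computed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons(seq: str) -> list[str]:
    """Split a coding sequence into consecutive codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq: str) -> str:
    """Translate a coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def gc_content(seq: str) -> float:
    """Percentage of G and C nucleotides."""
    return 100.0 * sum(b in "GC" for b in seq) / len(seq)


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def recode_gc_rich(seq: str) -> str:
    """Replace each codon with its G+C-richest synonym (assumed toy rule)."""
    recoded = []
    for codon in codons(seq):
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        # Ties are broken alphabetically so the result is deterministic.
        recoded.append(max(synonyms, key=lambda c: (sum(b in "GC" for b in c), c)))
    return "".join(recoded)


wild_type = "ATGAAATTAGCATAA"  # toy A+T-rich gene: Met-Lys-Leu-Ala-stop
modified = recode_gc_rich(wild_type)

print(modified, translate(modified))
print(f"G+C: {gc_content(wild_type):.1f}% -> {gc_content(modified):.1f}%")
print(f"nucleotide identity: {percent_identity(wild_type, modified):.1f}%")
```

In a real design, the synonym for each codon would be chosen from the codon usage of the host plant, and the sequence would also be screened for mRNA secondary structure and chance polyadenylation signals, as described above.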

Transgenic plants that were transformed with this highly modified 
synthetic protoxin gene had an approximately 100-fold-higher level of 
toxin protein than did plants transformed with the wild-type gene. 



MILESTONE 


Light-Inducible and Chloroplast-Associated Expression 
of a Chimaeric Gene Introduced into Nicotiana tabacum 
Using a Ti Plasmid Vector 

L. Herrera-Estrella, G. Van den Broeck, R. Maenhaut, 

M. Van Montagu, J. Schell, M. Timko, and A. Cashmore 
Nature 310:115-120, 1984 


After researchers had established the Ti plasmid system as an effective 
means of transforming many different plants, their attention turned to the 
development of procedures for the expression of foreign genes in plants. 
Initially, most of the genes that were introduced into plant cells were under 
the transcriptional control of either the relatively strong constitutive 35S 
promoter from cauliflower mosaic virus or the nearly as strong constitutive 
promoter for the nopaline synthase gene that is encoded within some T-DNAs. 
However, the development of plants with useful new and modified traits often 
requires that a specific protein be expressed only in certain tissues, e.g., 
leaves or roots, or only at certain times in the life of the plant, e.g., 
during early seedling development, fruit formation, or high-temperature 
stress. As a first step toward the development of plants that expressed 
foreign genes in a tissue-specific or time-specific manner, Herrera-Estrella 
et al. constructed a chimeric gene that included the 5'-flanking region from 
the pea gene for the small subunit of ribulose bisphosphate carboxylase 
containing transcriptional regulatory sequences, the coding region of a 
bacterial chloramphenicol acetyltransferase gene as an easily selectable 
gene, and the 3'-flanking region from the nopaline synthase gene containing 
signals both for termination of transcription and for polyadenylation of 
the mRNA. Normally the gene for the small subunit of ribulose bisphosphate 
carboxylase is expressed only in green or photosynthetic tissue; as expected, 
the chloramphenicol acetyltransferase gene under the regulatory control of 
this DNA sequence was also expressed only in photosynthetic tissues. This 
work provided one of the first demonstrations that, despite their complexity, 
plant promoters could direct the transcription of heterologous genes 
accurately and with tissue specificity. Since this study was done, researchers 
have used a wide range of plant promoters to direct tissue- and 
development-specific heterologous gene expression in transgenic plants. 









[Figure 19.2 diagram labels: chloroplast genome; rbcL gene; intergenic region; accD gene; rrn promoter; aadA gene; cry2Aa2 gene; TT] 

FIGURE 19.2 Site on the chloroplast genome where a foreign gene encoding the B. 
thuringiensis Cry2Aa2 protoxin is integrated by homologous recombination. The 
genes rbcL and accD are both present in a single copy per chloroplast genome. The 
intergenic region between these two genes, which is the site of insertion of the foreign 
genes, is smaller than it appears in this representation. The aadA gene 
(encoding spectinomycin and streptomycin resistances) and the cry2Aa2 gene are 
both under the transcriptional control of the constitutive chloroplast rrn promoter 
and transcription terminator (TT), and each contains its own ribosomal binding site. 
Integration of foreign DNA into the intergenic spacer region prevents insertion of a 
foreign gene from interfering with the expression of any endogenous chloroplast 
genes. Adapted from Kota et al., Proc. Natl. Acad. Sci. USA 96:1840-1845, 1999. 


Moreover, this higher level of insecticidal-toxin synthesis was directly correlated 
with increased insecticidal activity. 
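
The fold increases quoted in the text can be compared with the entries in Table 19.1. The short Python sketch below uses only the tomato and tobacco rows of that table; the variable and function names are hypothetical, and because the table reports ranges and rounded values, the ratios should be read as order-of-magnitude estimates rather than measured fold changes.

```python
# Percent-expression values copied from the tomato/tobacco rows of Table 19.1.
WILD_TYPE = 0.002                  # cry1Ab, truncated, WT codons
PARTIALLY_MODIFIED = (0.002, 0.2)  # cry1Ab, truncated, PM codons (reported range)
FULLY_MODIFIED = 0.3               # cry1Ab, truncated, FM codons


def fold_change(modified: float, reference: float) -> float:
    """Ratio of expression of a modified gene to that of the reference gene."""
    return modified / reference


pm_fold = tuple(fold_change(v, WILD_TYPE) for v in PARTIALLY_MODIFIED)
fm_fold = fold_change(FULLY_MODIFIED, WILD_TYPE)

print(f"partially modified: {pm_fold[0]:.0f}- to {pm_fold[1]:.0f}-fold over wild type")
print(f"fully modified: about {fm_fold:.0f}-fold over wild type")
```

The tabulated values give roughly 1- to 100-fold for the partially modified gene and about 150-fold for the fully modified gene, the same order of magnitude as the approximately 10-fold and 100-fold increases described in the text.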

In another approach to increasing the expression of the protoxin, one 
group of researchers expressed the fully modified protoxin gene under the 
control of the promoter for the gene that codes for the small subunit of the 
plant enzyme ribulose bisphosphate carboxylase and downstream from 
the chloroplast transit peptide sequence of this enzyme, so that the overproduced 
protoxin became localized within the chloroplast. This strategy 
led to a very high level of expression (nearly 1% of the total leaf protein) 
of the insecticidal protoxin. Other researchers have introduced an insecticidal-protoxin 
gene directly into the chloroplast DNA of the host plant. 
The B. thuringiensis protoxin gene was integrated into a specific site on the 
chloroplast DNA by constructing a vector that contained the protoxin gene 
flanked by two single-copy chloroplast genes (Fig. 19.2). Integration of the 
introduced genes occurs by homologous recombination. Once integrated 
into the chloroplast DNA, a protoxin gene under the transcriptional control 
of a strong chloroplast promoter may be expressed at high levels, so 
the protoxin may compose as much as 2 to 3% of the total soluble protein 
in the leaf, yielding a very high level of insecticidal activity. In addition, 
even this level of foreign-protein expression could be dramatically 
increased (by 10- to 20-fold) by coexpressing (as part of the same operon 
that was introduced into the chloroplast DNA) a B. thuringiensis gene that 
encodes a chaperonin protein that facilitates the correct folding of the 
insecticidal-protein protoxin. 

Integration of the B. thuringiensis protoxin gene into chloroplast DNA 
has a number of potential advantages over inserting it into the chromosomal 

FIGURE 19.3 Binary cloning vector carrying a cowpea trypsin inhibitor gene. 
The vector contains a broad-host-range origin of DNA replication (ori) and a 
kanamycin resistance (Kan^r) gene, which function in both E. coli and A. 
tumefaciens. Between the T-DNA left and right borders, there are (1) a neomycin 
phosphotransferase gene (NPT) under the transcriptional control of 
nopaline synthase signals (pNOS and tNOS), which enables kanamycin-resistant 
transformed plant cells to be selected, and (2) the cowpea trypsin 
inhibitor gene, which is under the control of the 35S promoter (P35S) from 
cauliflower mosaic virus and the transcription terminator-polyadenylation 
region from the nopaline synthase gene (tNOS). 



DNA. First, the protoxin gene does not have to be modified, because the 
chloroplast transcriptional and translational apparatuses are typically 
prokaryotic. Second, because there are many chloroplasts per cell and many 
copies of chloroplast DNA per chloroplast, the protoxin gene is present in 
multiple copies and therefore is more likely to be expressed at a high level. 
Third, in most plants, chloroplasts are transmitted only through the egg and 
not through pollen, which means that plants receive all of their chloroplast 
DNA from their female parent. Consequently, there is no risk of unwanted 
transfer of the protoxin gene to other plants in the environment by pollen. 
The disadvantage of expressing the B. thuringiensis protoxin in chloroplasts 
is that insects that attack stems or fruit will not encounter the protoxin, since 
these tissues do not have any chloroplasts. 

To date, some form of the gene for the protoxin has been introduced 
and expressed in a wide variety of plant species, including alfalfa, apple, 
broccoli, cabbage, canola, corn (maize), cotton, cranberry, eggplant, grape, 
hawthorn, juneberry, peanut, pear, poplar, potato, rice, rutabaga, soybean, 
spruce, sugar cane, tobacco, tomato, walnut, white clover, and white 
spruce. Following several seasons of successful field trials, these transgenic 
plants were approved for commercial release in the United States, Canada, 
and Argentina, and large-scale growth of the plants in the field began in 
1996. Although insect populations still have to be monitored to keep track 
of the frequency of resistant organisms, the use of crops expressing B. thuringiensis 
insecticidal proteins has already exceeded the length of time that 
it typically takes for resistance to arise in insects to conventional pesticides. 
A number of transgenic plants that express an insecticidal toxin or protoxin 
are currently being used commercially—it is estimated that worldwide, in 
2007, farmers planted approximately 40 million hectares of transgenic B. 
thuringiensis insecticidal-protein-containing crops. This technology has 
more than lived up to the hopes and expectations of scientists. 
Notwithstanding the initial concerns about the technology, especially in 
Europe, this approach to crop protection has gained widespread acceptance 
throughout much of the world. 

Other Strategies for Protecting Plants against Insects 

No single B. thuringiensis protoxin is effective against a broad range of 
insect species. This may limit the overall usefulness of these protoxins. 
However, plants have evolved general insect defense mechanisms that are 
sufficient for plant survival but not always effective enough to keep the 
damage to a level that would be acceptable for crop plants. For example, 
some plants produce protease inhibitors that, when ingested, prevent the 
feeding insect from hydrolyzing plant proteins, thereby effectively starving 
the predator insect. Consequently, it seemed reasonable to isolate a plant 
gene for a protease inhibitor, add a strong promoter, and create transgenic 
crop plants that produce sufficiently high levels of the protease inhibitor to 
reduce damage from insect predation. 

Protease inhibitors. In one study, researchers isolated a clone that encodes 
cowpea trypsin inhibitor from a complementary DNA (cDNA) clone bank 
by using a chemically synthesized DNA probe based on the amino acid 
sequence of the cowpea trypsin inhibitor protein. The full-length cDNA 
was subcloned onto a Ti plasmid binary cloning vector (Fig. 19.3) and introduced 
into a strain of A. tumefaciens carrying a disarmed Ti plasmid that 

[Figure 19.4 map, in order: Pin2 5' end; Act1 intron; Pin2 gene; Pin2 3' end; P35S 5' end; bar gene; nos 3' end] 

FIGURE 19.4 Plasmid vector carrying the potato proteinase inhibitor II gene (Pin2). 5' 
end, the region of DNA preceding the gene; 3' end, the region of DNA following the 
gene; Act1 intron, the first intron from the rice actin 1 gene; P35S 5' end, the 35S promoter 
from cauliflower mosaic virus; bar gene, the bacterial phosphinothricin 
acetyltransferase gene; nos 3' end, the region of DNA following the nopaline synthase 
gene. The bar gene serves as a selectable marker for transgenic plants, conferring 
resistance to the herbicide Basta (ammonium glufosinate). 


contained active vir genes. Following A. tumefaciens infection of tobacco 
leaf disks with this vector, cells that incorporated the cloned DNA were 
selected for growth on kanamycin, and transgenic plants were regenerated. 
The damage caused by Heliothis virescens (tobacco budworm) larvae to 
transgenic plants that expressed more than 2 µg of cowpea trypsin inhibitor 
per mg of protein was significantly less than the damage inflicted on 
nontransformed plants. 

Cowpea seeds that contain approximately 2 µg of inhibitor per mg of 
plant protein are not toxic to either animals or humans. However, if the 
amount of protease inhibitor produced by a transgenic plant is determined 
to be a potential hazard, then it is possible to limit the expression of the 
protease inhibitor to the plant tissues that the major insect pests prefer but 
that are not used as food by humans or animals. In other words, a cloned 
protease inhibitor gene could be active in the leaves and roots of a plant but 
not in the commercially valuable fruit. 

Introduction of the potato proteinase inhibitor II gene provides rice 
plants with protection against the pink stem borer (Sesamia inferens), a 
major insect pest of rice. Infestation of rice plants by pink stem borers 
causes severe damage to the plants, often resulting in a hollow stem and 
dead panicles with no seeds. A plasmid carrying the potato proteinase 
inhibitor II gene under the control of its own promoter and transcription 
termination region was constructed. The plasmid also contained the first 
intron from the rice actin gene inserted between the promoter and the 
potato proteinase inhibitor II coding region (Fig. 19.4). This construct was 
introduced into rice suspension cells by microprojectile bombardment, and 
transgenic plants were generated. When pink stem borer larvae were artificially 
applied, 70 to 100% of the wild-type plants were severely damaged 
by insect predation, while only 15 to 20% of the transgenic plants were 
damaged. Since plant proteinase inhibitors are common components of 
both human and animal food and are readily inactivated by cooking, their 
introduction into new crops can be regarded as safe. 

Another strategy that is designed to increase the effectiveness of relatively 
low levels of B. thuringiensis insecticidal-toxin activity entails combining 
the toxin with a serine protease inhibitor. In laboratory trials, 
investigators found that when the amount of purified B. thuringiensis insecticidal 
toxin that causes minimal insect mortality was mixed with a low 
concentration of protease inhibitor, the insecticidal activity of the mixture 
was 20-fold greater than that of the B. thuringiensis protoxin alone. To test 
whether this scheme would function in transgenic plants, a DNA fragment 
that encoded a fusion protein consisting of both a protease inhibitor and a 
truncated toxin was constructed. Transgenic tobacco plants that produced 
small amounts of this fusion protein were protected from insect attack. 

α-Amylase inhibitor. Another way of imparting insect resistance to susceptible 
plants entails using a gene that encodes an α-amylase inhibitor. 
The cowpea weevil (Callosobruchus maculatus) and the azuki bean weevil 
(Callosobruchus chinensis) are seed-feeding beetles that cause considerable 
economic loss of these legume crops, especially in developing countries. 
When larvae of these insects are fed a diet that includes the common bean 
(Phaseolus vulgaris), insect growth is inhibited. This growth inhibition is 
attributable to the presence of an α-amylase inhibitor in the seed proteins 
of the common bean. Accordingly, the gene for the α-amylase inhibitor 
from the common bean was isolated, placed under the transcriptional control 
of the strong seed-specific promoter for the bean phytohemagglutinin 
gene, and used to transform pea plants (Pisum sativum). Peas are usually 
quite susceptible to damage by both cowpea weevils and azuki bean weevils. 
However, transgenic pea plants that expressed the α-amylase inhibitor 
were resistant to both of these insects. The level of resistance to cowpea 
weevils was found to be proportional to the amount of α-amylase inhibitor 
that the transgenic plant produced (Fig. 19.5). 

Cholesterol oxidase. Another approach to developing insect-resistant 
transgenic plants makes use of a bacterial cholesterol oxidase gene. 
Cholesterol oxidase, which is present in a range of different bacterial genera, 
catalyzes the oxidation of 3-hydroxysteroids to ketosteroids and hydrogen 
peroxide. This enzyme is commonly used in assays to determine the levels 
of cholesterol in human serum. Low levels of the enzyme have a high level 
of insecticidal activity against larvae of the boll weevil (Anthonomus grandis 
grandis) (Fig. 19.6), a common and economically important insect (Coleoptera) 
pest of cotton, and have lower levels of activity against some lepidopteran 
pests. Cholesterol oxidase probably acts by disrupting the insect's midgut 
epithelial membrane, thus killing the insect. A cholesterol oxidase gene 
encoding a protein with a molecular mass of approximately 55,000 daltons 


FIGURE 19.5 Mortality of cowpea weevil larvae reared on transgenic pea plants that 
produce different amounts of α-amylase inhibitor. 

with a length of 504 amino acids plus a leader peptide of approximately 
5,000 daltons (43 amino acids) was isolated from a strain of Streptomyces and 
cloned into a vector under the control of a plant virus (figwort mosaic virus) 
promoter and a termination sequence from the 3' region of the A. tumefaciens 
nopaline synthase gene. When this construct was introduced into tobacco 
cell protoplasts, the transformed cells actively expressed the cholesterol 
oxidase. When the gene is introduced into cotton plants on a commercial 
scale, either by itself or in combination with genes for other biological insecticides, 
it should be an effective means of protecting plants against damage 
from insect predation. 
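
The sizes quoted for the cholesterol oxidase protein can be sanity-checked with a common rule of thumb. In the Python sketch below, the residue counts (504 and 43) come from the text; the figure of about 110 daltons per amino acid residue is a general approximation introduced here as an assumption, and the function names are hypothetical.

```python
AVG_RESIDUE_DA = 110  # rule-of-thumb average mass per amino acid residue (assumption)

MATURE_AA = 504  # mature cholesterol oxidase (from the text)
LEADER_AA = 43   # leader peptide (from the text)


def approx_mass_da(n_residues: int) -> int:
    """Estimate the molecular mass of a polypeptide from its length."""
    return n_residues * AVG_RESIDUE_DA


def coding_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length of the open reading frame needed to encode n_residues."""
    return 3 * (n_residues + (1 if include_stop else 0))


mature_mass = approx_mass_da(MATURE_AA)
leader_mass = approx_mass_da(LEADER_AA)
orf_bp = coding_length_bp(MATURE_AA + LEADER_AA)

print(f"mature protein: ~{mature_mass} Da (text: ~55,000 Da)")
print(f"leader peptide: ~{leader_mass} Da (text: ~5,000 Da)")
print(f"open reading frame including stop codon: {orf_bp} bp")
```

Both estimates fall within a few percent of the masses given in the text, which suggests that the reported lengths and masses are mutually consistent.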

Vegetative insecticidal toxins. In addition to the well-characterized Cry 
insecticidal toxins (over 350 of which have been identified), B. thuringiensis 
produces a secreted insecticidal protein during its vegetative growth 
stage. To date, two major groups of vegetative insecticidal proteins (Vip) 
have been identified. One group consists of the proteins Vip1 and Vip2, 
which are not toxic to lepidoptera; the other consists of Vip3, which targets 
several major lepidopteran pests. The less-well-characterized Vip proteins 
may act synergistically with Cry proteins to kill their target insects, 
providing a double-barreled approach to insect toxicity, so that it is 
extremely difficult for susceptible insects to develop resistance. It would 
therefore be advantageous if transgenic plants expressing both Cry and Vip 
proteins could be created. As a first step, researchers shuffled the two 
major domains of two Vip3 proteins, Vip3Ac1 and Vip3Aa1 (Fig. 19.7). One of 
the hybrid Vip3 proteins (i.e., Vip3AcAa) displayed the highest activity of 
the four proteins against fall armyworms, cotton bollworms, and silkworms. 
Moreover, only the Vip3AcAa construct was toxic to a strain of cabbage 
looper that was resistant to the well-characterized B. thuringiensis 
insecticidal protein Cry1Ac. The chimeric toxin Vip3AcAa enriches the 
diversity of Vip toxins that can be used together with conventional Cry 
proteins to generate transgenic plants that are highly unlikely to select 
for resistant insects. 

Other proteins. The activities of several other proteins have been utilized
in an effort to protect plants from insect predation. For example, some
lectins, i.e., carbohydrate-binding proteins found in the seeds and storage
tissues of a variety of plant species, are toxic to certain species of insects.
While many plant lectins are toxic to mammals as well as insects, the lectin
from the snowdrop plant (Galanthus nivalis) is toxic only to insects. With
this in mind, the snowdrop lectin gene has been introduced into
approximately a dozen different plants, with the result that plants that
expressed this protein were damaged by aphids to a lesser extent than
nontransformed plants. However, although the snowdrop lectin significantly
lowered the amount of leaf material eaten, insect mortality was only slightly
increased.

FIGURE 19.6 Effect of increasing amounts of cholesterol oxidase on the mortality of
boll weevil larvae. ppm, parts per million. Adapted from Corbin et al., Appl.
Environ. Microbiol. 60:4239-4244, 1994.

FIGURE 19.7 Generation of two hybrid insecticidal proteins constructed by domain
shuffling. The approximate sizes of the amino- and carboxy-terminal fragments are
indicated. The hybrid protein Vip3AcAa was found to have the greatest level of
insecticidal activity with the insect larvae tested.

When the gene encoding the enzyme tryptophan decarboxylase from 
periwinkle (Catharanthus roseus) is expressed in tobacco, the plants are 
protected from damage by the whitefly (Bemisia tabaci). While the precise 
mechanism of this protection is unknown, it has been suggested that the 
tryptamine that is produced by this enzyme, following the decarboxylation
of tryptophan, is used in the production of insect-inhibiting plant
alkaloids. 

The gram-negative bacterium Photorhabdus luminescens produces a
283-kilodalton protein, toxin A, that is highly toxic to a variety of insects.
When this protein was expressed in transgenic Arabidopsis thaliana (Box
19.1) plants in amounts of >700 ng/mg of extractable plant protein, it was
found to be highly toxic to the tobacco hornworm, as well as the southern
corn rootworm, with insect mortality typically 100%.

Finally, transgenic corn plants that express avidin, a glycoprotein
isolated from chicken eggs that binds the coenzyme biotin with extremely
high affinity, caused biotin deficiency that led to stunted growth and
death in a number of different insect species. Importantly, the levels of
avidin that are toxic to insects are not toxic to mice, suggesting that
protecting plants with an avidin transgene is not necessarily a problem for
humans.

RNA interference. The ingestion or microinjection of double-stranded
RNA into some worms and insects has been used to silence genes in these
organisms. This gene silencing works through the generation of RNA
interference (RNAi) (see chapter 11). In one study, when 290 different
double-stranded RNAs, each targeting a gene thought to encode an essential
or important function, were fed (one at a time) at a low level to larvae of
the western corn rootworm (Diabrotica virgifera LeConte), growth was
significantly inhibited with 67 of the RNAs. Fourteen of the double-stranded
RNAs had 50% lethal concentrations of <5.2 ng/cm². Based on these results,
transgenic corn expressing one of the double-stranded RNAs, targeting the
transcript for vacuolar ATPase, was constructed (Fig. 19.8). The transgenic
plants were protected against the western corn rootworm to an extent
comparable to the protection afforded by a B. thuringiensis transgene. The
demonstration that it is possible to produce RNAi in coleoptera following
oral delivery of double-stranded RNA is an important first step in the
development of a completely novel approach to developing a wide range of
insect-resistant plants, including resistance to some insects that are
refractory to the B. thuringiensis insecticidal toxin.

In a variation on the above-mentioned strategy, another group of
researchers first identified the mechanism that the cotton bollworm uses to
protect itself against the compound gossypol, which is produced by cotton
plants to prevent insect predation. Gossypol is a yellow polyphenolic
aldehyde that permeates cells and acts as an inhibitor of several of the
insect's dehydrogenase enzymes. It has been used as a male oral contraceptive
in China, possesses antimalarial properties, and may have anticancer
properties. The cotton bollworm protects itself from the toxic effects of
gossypol by inactivating the gossypol with the enzyme cytochrome P450
monooxygenase. Thus, transgenic plants were constructed to synthesize an RNAi
molecule that would silence the insect's gene for the cytochrome P450
monooxygenase. With expression of its cytochrome P450 monooxygenase blocked,
the insect was exposed to the full toxic effects of the plant-produced
gossypol, so that it was either killed or at least debilitated, and the
damage to the plant was limited (Fig. 19.9). Plants produce a myriad of
allelochemicals to protect themselves against insects, and many insects
have developed strategies to overcome the toxic effects of these
compounds. Therefore, the mechanisms utilized by insects to overcome the
toxicity of the plant-produced compounds are attractive targets for
developing insect-resistant plants in the future.


BOX 19.1

Arabidopsis thaliana

Arabidopsis thaliana (thale cress) is a small weed in the same family
(Brassicaceae) as canola, mustard, and broccoli. It is native to Europe, Asia,
and northwestern Africa. A. thaliana is popular with scientists as a model
organism in plant biology and genetic studies. It has one of the smallest
genomes of any flowering plant (at 7 × 10⁷ bp, it is similar to the size of
the yeast genome, which is approximately 1.5 × 10⁷ bp), which makes it
readily amenable to molecular genetic studies. The small size of its genome
has made A. thaliana useful for the generation and selection of mutants,
and it was the first plant genome to be sequenced, in 2000.

The plant's small size and short life cycle are also advantageous for
research. Laboratory strains of A. thaliana take about 6 weeks from
germination to mature seed. The small size of the plant is convenient for
cultivation in limited space, and it produces many seeds; an individual
plant can produce several thousand seeds.

Plant transformation in Arabidopsis is straightforward and has become a
routine procedure in many laboratories, using A. tumefaciens to transfer
DNA to the plant genome. The current Arabidopsis transformation protocol,
termed "floral dip," involves dipping a flower into a solution containing
Agrobacterium, the DNA of interest, and a detergent. This method avoids
the need for tissue culture or plant regeneration. The idea is that some of
the Agrobacterium cells will transfer their T-DNA containing the target
DNA into the reproductive tissue of the plant. As a consequence of the
above-mentioned traits and the relative complexity of most other plants,
Arabidopsis has become the E. coli of the plant world.

FIGURE 19.8 Use of RNAi to protect plants against insect predation. The
double-stranded RNAs (dsRNAs) were produced using a commercial in vitro
transcription system developed for this purpose. A number of double-stranded
RNAs from essential insect genes were tested for the ability to elicit RNAi
and inhibit insect larval proliferation. DNA encoding one of the most
effective double-stranded RNAs, which corresponds to a portion of an ATPase
gene, was spliced into a Ti plasmid vector and used to transform corn plants.
The transformants with the highest levels of resistance to the western corn
rootworm were selected. UTR, untranslated region.


Preventing the Development of B. thuringiensis-Resistant Insects

There is little doubt that insects have the genetic potential to develop
resistance to B. thuringiensis insecticidal toxins, and the more that
B. thuringiensis insecticidal toxins are used, the greater the likelihood that
populations of target insects will accumulate resistant individuals.
Experimental strategies have been devised to prevent transgenic plants that
express the B. thuringiensis protoxin gene from acting as selection agents for
resistant insects. In one approach, the expression of the insecticidal toxin in
transgenic plants was limited to a short period. The gene for the
B. thuringiensis protoxin was cloned downstream of the promoter of a gene from
tobacco called the pathogenesis-related protein 1a (PR-1a) gene. The expression
of the PR-1a gene is part of a natural defense mechanism that combats
pathogens. The PR-1a gene is normally induced by any one of a variety of
pathogenic organisms or by chemicals, such as salicylic acid and polyacrylic
acid. When transgenic plants with the B. thuringiensis protoxin gene under the
control of the PR-1a promoter were treated with a chemical inducer, they
synthesized detectable levels of insecticidal toxin within 1 day of
application, which protected the plants against insect attack. Therefore, it is
conceivable that the protoxin could be induced by the administration of an
inexpensive and safe chemical inducer only when it is required during the
growing season, e.g., when insect larvae are feeding. Such periodic
production should lower the selection pressure for resistant insects.

One approach that might increase the insecticidal effectiveness of a
B. thuringiensis toxin expressed in transgenic plants and also decrease the
development of insect resistance is to fuse the bacterial insecticidal toxin
to another protein that increases the binding of the insecticide to the target
intestinal cellular receptor. With this in mind, a fusion protein consisting of
an N-terminal B. thuringiensis insecticidal toxin and a C-terminal peptide
consisting of the nontoxic B-chain of the protein ricin was constructed.
Ricin is a protein toxin that is extracted from castor beans. It consists of an
A-chain of 267 amino acids that contains the toxin activity and a B-chain of
262 amino acids that is catalytically inactive but serves to mediate entry of
the complex into the cytosol. The B. thuringiensis insecticidal toxin binds to
a receptor located within the membrane of the insect midgut (Fig. 19.10).
Normally, since each insecticidal toxin interacts with a single receptor, the
loss or modification of the receptor leads to resistance to the insecticidal
toxin. However, since the ricin B-chain binds with very high affinity to
N-acetylgalactosamine residues (which are adjacent to the B. thuringiensis
insecticidal-toxin receptor), the fusion protein has two separate and
independent means by which it is targeted to the receptor. With this fusion
protein, it becomes extremely unlikely that both targeting mechanisms will
cease to be effective at the same time. It has been suggested that this
approach may be most effective in field situations where it is difficult or
impossible to implement a spatial-refuge (refugium) strategy (such as for
transgenic rice).

FIGURE 19.9 Use of RNAi to inhibit the synthesis of a P450 monooxygenase enzyme
that inactivates the plant secondary metabolite gossypol. (A) In wild-type plants,
the insect's P450 monooxygenase inactivates the gossypol, the plant is
defenseless, and the insect can severely damage the plant. (B) In transgenic
plants that produce an RNAi molecule that directs the degradation of the P450
monooxygenase mRNA, the gossypol synthesized by the plant is able to prevent
the insect from severely damaging the plant.

A number of additional strategies designed to prevent the development
of insects that are resistant to the B. thuringiensis insecticidal toxin
have been devised. They include the following. 

• Using spatial-refuge strategies. In this approach, a certain fraction of
each farmer's land, generally around 20%, is planted with a nontransgenic
crop, while the remainder of the land is planted with a
transgenic version of the same crop expressing a high level of the
B. thuringiensis insecticidal toxin. The idea behind this strategy is that
the very small number of insects that are able to survive on the
transgenic insecticidal crop (a high dose of toxin kills 99.9% of
susceptible insects) will mate with the much larger number of toxin-sensitive
insects from the nontransgenic crop. Thus, the gene for resistance is
effectively diluted: the offspring of these matings are mostly
heterozygotes, and a high dose of the toxin kills 99% of the
heterozygotes, so the pest population remains sensitive to the insecticide.

• Using two or more different B. thuringiensis insecticidal toxins 
(sometimes called gene stacking) or fusing portions of the active 
regions of two different toxin genes to generate novel hybrid protein 
insecticidal toxins (Box 19.2; also see chapter 16). This approach 
assumes that resistance to two control methods is much less likely to 
develop simultaneously. This approach has been found to be effective
in the field when it is combined with spatial refugia.

• Transforming plants with both a B. thuringiensis insecticidal-toxin 
gene and another form of biological insecticide (e.g., an α-amylase
inhibitor gene). This also assumes that resistance to two control 
methods is much less likely to develop simultaneously. 

• Spraying low levels of chemical insecticides at the same time that 
transgenic plants expressing a B. thuringiensis insecticidal-toxin gene 
are used. This also assumes that resistance to two control methods is 
much less likely to develop simultaneously. 
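The dilution argument behind the spatial-refuge strategy can be made concrete with a toy population-genetics calculation. The sketch below is illustrative only and rests on several assumptions that are not in the text: a single resistance locus with alleles R and S, random mating across the whole field, discrete generations, full survival of RR insects on the toxin-producing crop, and a starting resistance allele frequency of 0.001. The 99.9% and 99% kill rates and the 20% refuge come from the first strategy listed above; all function and variable names are hypothetical.

```python
def next_resistance_freq(q, refuge_fraction, bt_survival):
    """Advance the resistance allele frequency q by one generation of selection."""
    # Average survival of each genotype: everyone survives in the refuge,
    # and genotype-specific survival applies on the toxin-producing crop.
    fitness = {
        genotype: refuge_fraction + (1.0 - refuge_fraction) * survival
        for genotype, survival in bt_survival.items()
    }
    # Hardy-Weinberg genotype frequencies before selection (random mating assumed).
    freq = {"RR": q * q, "RS": 2.0 * q * (1.0 - q), "SS": (1.0 - q) ** 2}
    mean_fitness = sum(fitness[g] * freq[g] for g in freq)
    # R alleles among survivors: all alleles of RR, half the alleles of RS.
    r_alleles = fitness["RR"] * freq["RR"] + 0.5 * fitness["RS"] * freq["RS"]
    return r_alleles / mean_fitness


def generations_until(threshold, q0, refuge_fraction, bt_survival,
                      max_generations=10_000):
    """Count generations until the resistance allele frequency reaches threshold."""
    q = q0
    for generation in range(max_generations):
        if q >= threshold:
            return generation
        q = next_resistance_freq(q, refuge_fraction, bt_survival)
    return None  # threshold not reached within max_generations


# Survival on the toxin-producing crop: 99.9% of susceptible (SS) insects and
# 99% of heterozygotes (RS) are killed; RR survival of 1.0 is an assumption.
BT_SURVIVAL = {"SS": 0.001, "RS": 0.01, "RR": 1.0}
Q0 = 0.001  # assumed initial frequency of the resistance allele

for refuge in (0.0, 0.2):
    gens = generations_until(0.5, Q0, refuge, BT_SURVIVAL)
    print(f"refuge fraction {refuge:.0%}: R allele reaches 50% after {gens} generations")
```

Under these assumptions, the run with a 20% refuge takes many more generations to reach the threshold than the run with no refuge, which is the qualitative point of the strategy. The absolute numbers depend entirely on the assumed parameters and should not be read as predictions.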

The insect resistance management strategies that have been used up to now
appear to have been successful. For example, in one large study, researchers
monitored the level of resistance of the pink bollworm (Pectinophora
gossypiella) to B. thuringiensis in cotton fields over the course of 8 years,
from 1997 to 2004. They found that the frequency of resistance did not change
over this period and attributed this result to the use of refugia, the recessive
inheritance of the resistance, the fact that the resistance that developed was
incomplete, and the fitness costs associated with the development of insect
resistance. It is nevertheless essential that new strategies to prevent the
development of insect resistance continue to be developed. The amount of
land that is devoted to the growth of transgenic crops that express a
B. thuringiensis insecticidal toxin continues to increase rapidly worldwide.
The types of crops that have been engineered to express a B. thuringiensis
insecticidal toxin are also continuing to expand. Thus, a responsible
approach dictates that we continue to be careful and vigilant in avoiding
the development of insect resistance so as not to waste this resource.

FIGURE 19.10 Schematic representation of a hybrid protein consisting of the Cry1Ac
insecticidal-toxin protein (at the N terminus), the B-chain of ricin (at the
C terminus), and an insect midgut insecticidal-protein receptor. The Cry protein
recognizes and binds to the insect midgut receptor, while the B-chain of ricin
acts as an N-acetylgalactosamine-specific lectin that binds very tightly to
these residues, which are located adjacent to the receptor. Bt, B. thuringiensis.

BOX 19.2

Managing Insect Resistance to B. thuringiensis by Gene Stacking

Although various strains of B. thuringiensis produce hundreds of different
bacterial insecticidal proteins, transgenic field crops engineered to
produce these proteins have utilized only a few types of insecticidal toxin.
This has led to concerns that the expression of a single toxin
throughout the growing season could result in insects evolving resistance to
the toxin's effects. One of the strategies that has been employed to
prevent or delay the development of insect resistance to B. thuringiensis
insecticidal toxins is transforming plants with two unrelated
B. thuringiensis insecticidal-toxin genes. Recent data indicate that gene
stacking (also called gene pyramiding) of two genes encoding proteins with
different modes of action significantly delays the development of insect
resistance to these insecticidal toxins.

Bollgard II is a strain of genetically engineered cotton that was developed
by the Monsanto Corporation to produce both the Cry2Ab2 and Cry1Ac
insecticidal proteins. This strain was produced by retransformation of the
previously commercialized Bollgard cotton, which produces only the
Cry1Ac insecticidal protein. The commercial use of Bollgard cotton began
in 1996, and by 2003, it had been grown globally on more than 32 million
acres with the benefits of reduced insecticide use, improved control of
target insect pests, increased yield, and reduced production costs accruing
to farmers. The two insecticidal proteins produced by Bollgard II provide
protection against several major lepidopteran pests of cotton, including the
cotton bollworm, tobacco budworm, pink bollworm, and armyworm. In
addition to an expanded insecticidal range, Bollgard II is expected to
significantly delay (or prevent) the development of insect resistance in the
field. Nevertheless, the use of Bollgard II requires the concomitant
employment of refugia.


Virus Resistance 

Plant viruses often cause considerable crop damage and significantly 
reduce yields. Therefore, in the absence of effective chemical treatments, 
plant breeders have attempted to transfer naturally occurring virus resistance
genes from one plant strain (cultivar) to another. However, resistant
cultivars often revert to virus sensitivity, and resistance to one virus does 
not necessarily confer resistance to other, similar viruses. Natural virus 
resistance can be achieved in different ways: viral transmission can be 
blocked, establishment of the virus can be prevented, or viral symptoms 
can be bypassed or resisted. Genetic engineering has been used to develop 
nonconventional types of virus-resistant transgenic plants. 

Viral Coat Protein-Mediated Protection 

When transgenic plants express the gene for a coat protein (which usually
is the most abundant protein of a virus particle) of a virus that normally
infects those plants, the ability of the virus to subsequently infect the plants
and spread systemically is often greatly diminished. For a long time, the
precise mechanism by which the presence of coat protein genes inhibits
viral proliferation was not understood; however, it is now thought that it
likely works through the generation of RNAi. Moreover, the antiviral effect
occurs early in the viral replication cycle and, as a result, prevents any
significant amount of viral synthesis. This feature is an advantage because it
decreases the probability of selecting for spontaneous viral mutants that
can overcome this resistance and replicate in the presence of viral coat
protein. The viral coat protein gene approach has been used to confer tolerance
for a number of different plant viruses (Table 19.2). With this approach,
researchers have developed virus-resistant transgenic plants for a number
of different crops. Although complete protection is not usually achieved,
high levels of virus resistance have been reported. In addition, a coat
protein gene from one virus sometimes provides tolerance for a broad
spectrum of unrelated viruses. The utility of this strategy is supported by the
observation that transgenic plants that encode viral coat proteins do as well
in field trials as in the laboratory studies.

In both eukaryotes and prokaryotes, an RNA molecule that is
complementary to a normal gene transcript (mRNA) is called antisense RNA. The
mRNA, being translatable, is considered to be a sense RNA. The presence
of antisense RNA can decrease the synthesis of the gene product by forming
a duplex molecule with the normal sense mRNA, thereby preventing it
from being translated. The antisense RNA-mRNA duplex is also rapidly
degraded, a response that diminishes the amount of that particular mRNA
in the cell. Theoretically, it should be possible to prevent plant viruses from
replicating and subsequently damaging plant tissues by creating
transgenic plants that synthesize antisense RNA that is complementary to viral
coat protein mRNA.

TABLE 19.2 Some transgenic plants engineered to have viral coat
protein-mediated protection against viral infection

Viral source of coat protein      Transgenic plant(s)
Alfalfa mosaic virus              Alfalfa, tobacco, tomato
Arabis mosaic virus               Tobacco
Beet necrotic yellow vein virus   Sugar beet
Cucumber mosaic virus             Cucumber, tobacco
Cymbidium ringspot virus          Tobacco
Grapevine chrome mosaic virus     Tobacco
Maize dwarf mosaic virus          Sweet corn
Papaya ringspot virus             Papaya, tobacco
Plum pox virus                    Tobacco
Potato aucuba mosaic virus        Tobacco
Potato leafroll virus             Potato
Potato virus S                    Potato
Potato virus X                    Potato, tobacco
Potato virus Y                    Potato, tobacco
Rice stripe virus                 Rice
Soybean mosaic virus              Tobacco
Tobacco etch virus                Tobacco
Tobacco mosaic virus              Tobacco, tomato
Tomato mosaic virus               Tomato
Tomato rattle virus               Tobacco
Tomato streak virus               Tobacco
Tomato spotted wilt virus         Tobacco
Watermelon mosaic virus 2         Tobacco
Zucchini yellow mosaic virus      Muskmelon, tobacco

FIGURE 19.11 Procedure for introducing CuMV coat protein cDNA into plant cells.
RNA4, which encodes the coat protein, is isolated from a viral RNA preparation
and used as the template for the synthesis of double-stranded cDNA. Linkers are
added to the cDNA preparation, and the cDNAs are cloned into an E. coli plasmid
vector. A full-length cDNA clone is identified, excised from the E. coli vector, and
subcloned into a Ti plasmid cloning vector between the 35S promoter from
cauliflower mosaic virus (P35S) and the transcription terminator from the gene
for the small subunit of ribulose bisphosphate carboxylase (tRBC). This cloning
step creates two orientations for the RNA4 cDNA. In one case, the RNA that is
transcribed is translated into coat protein (sense RNA), and in the other case,
the transcribed RNA is complementary to the mRNA for the coat protein
(antisense RNA).

In one of many studies, the efficacies of the viral coat protein gene and
antisense RNA approaches were compared by cloning the cDNA for the
coat protein of cucumber mosaic virus (CuMV) into tobacco plants in two
orientations (sense and antisense; one orientation per plant) and then
testing transgenic plants for sensitivity to viral infection (Fig. 19.11). The
genome of CuMV consists of three separate single-stranded pieces of RNA,
each coding for a specific viral protein. In vivo, one of these pieces, RNA3,
is processed to remove a portion of its sequence, thereby generating RNA4,
which encodes the viral coat protein. To create transgenic plants that either
produced normal mRNA and expressed the viral coat protein or produced
its antisense RNA, the following steps were carried out:

1. Isolation of RNA4 

2. In vitro enzymatic conversion of RNA4 into a double-stranded 
cDNA 

3. Addition of linkers onto the cDNA 

4. Insertion of the full-length cDNA sequences into cloning vectors in 
both orientations, with each oriented sequence under the control of 
the 35S promoter sequence from cauliflower mosaic virus and the 
termination-regulatory sequences from the plant gene for the small 
subunit of ribulose bisphosphate carboxylase 

5. Formation of separate transgenic plants carrying the cDNA sequence
in one of the two possible orientations
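The consequence of inserting the cDNA in each of the two orientations (step 4) can be illustrated with a short Python sketch. It assumes that the transcript reads like whichever cDNA strand lies in the 5'-to-3' direction downstream of the promoter. The 15-nucleotide sequence is a made-up placeholder, not the CuMV RNA4 sequence, and the function names are hypothetical.

```python
# Illustrative sketch only: a toy cDNA fragment, not the real CuMV RNA4 sequence.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def transcribe(coding_strand: str) -> str:
    """The transcript has the sequence of the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def transcripts_for_both_orientations(cdna: str) -> dict[str, str]:
    """RNA made when the cDNA is placed after the promoter in each orientation."""
    return {
        "sense": transcribe(cdna),                          # translatable mRNA
        "antisense": transcribe(reverse_complement(cdna)),  # complementary to the mRNA
    }


def can_pair(rna_a: str, rna_b: str) -> bool:
    """True if the two RNAs are perfect reverse complements (full-length duplex)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        pairs[a] == b for a, b in zip(rna_a, reversed(rna_b))
    )


toy_cdna = "ATGGCTTCTAAAGGA"  # hypothetical coding-strand fragment
rnas = transcripts_for_both_orientations(toy_cdna)
print("sense    :", rnas["sense"])
print("antisense:", rnas["antisense"])
print("can form duplex:", can_pair(rnas["sense"], rnas["antisense"]))
```

The sense transcript begins with the AUG start codon and could be translated, whereas the antisense transcript cannot encode the coat protein but can base-pair with the sense mRNA along its whole length.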

The Ti plasmid binary vector system was used to transfer both
protein-producing sense and antisense RNA-producing cDNA sequences to
separate tobacco cells, from which transgenic plants were regenerated (Fig.
19.12). The transgenic tobacco plants that expressed the CuMV coat protein
were protected from viral-particle accumulation and did not show
symptoms of viral infection, regardless of whether the inoculum of the challenge
virus was high or low, whereas the antisense orientation construct
protected transgenic plants only against low viral doses.

FIGURE 19.12 Ti plasmid binary cloning vectors containing either the
protein-producing sense (A) or the RNA-producing antisense (B) orientation of
the CuMV coat protein cDNA. Each cDNA sequence is under the control of the 35S
promoter (P35S) from cauliflower mosaic virus and the transcription
terminator-polyadenylation site (tRBC) from the gene for the small subunit of
ribulose bisphosphate carboxylase. The vector also contains a neomycin
phosphotransferase (NPT) gene under the control of nopaline synthase
transcription signals (pNOS and tNOS), an Spc^r gene, a T-DNA right-border
sequence, a T-DNA left-border sequence, and a broad-host-range origin of DNA
replication (ori). The protein-producing sense (+) orientation is shown by the
A→Z arrow, and the RNA-producing antisense (-) orientation is shown by the
Z→A arrow.

Several groups of scientists have constructed transgenic plants that
synthesize antisense RNA copies of viral coat protein genes and tested whether
these plants can withstand a viral challenge. In all instances, the plants were
protected against the invading virus only when low concentrations of the
virus were used. At high concentrations, the plants were damaged by the
virus. In addition, antisense RNA copies of viral coat protein genes
generally afforded a much lower level of protection to transgenic plants than
did sense versions of the viral coat protein genes. Although the antisense RNA
approach may not be an effective means of creating virus-resistant plants, it
may be possible to use small interfering RNA (double-stranded RNAs about
21 nucleotides long) to protect plants against invading viruses. In this case,
the interfering RNA would act to target specific mRNAs (e.g., mRNAs
encoding viral coat proteins) for nuclease digestion.
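As a rough illustration of how 21-nucleotide interfering RNAs relate to a target mRNA, the following Python sketch lists every 21-nucleotide window of a target sequence together with the strand complementary to it. This is a minimal sketch under stated assumptions: the target sequence is a made-up placeholder rather than a real viral coat protein mRNA, the function names are hypothetical, and real small interfering RNA design applies additional criteria (for example, 3' overhangs and sequence composition) that are not modeled here.

```python
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")


def guide_strand(target_window: str) -> str:
    """Reverse complement of an mRNA window: the strand that base-pairs with it."""
    return target_window.translate(RNA_COMPLEMENT)[::-1]


def sirna_candidates(target_mrna: str, length: int = 21) -> list[tuple[int, str, str]]:
    """Return (start, target window, complementary strand) for every window."""
    mrna = target_mrna.upper().replace("T", "U")
    if set(mrna) - set("AUGC"):
        raise ValueError("sequence must contain only A, U/T, G, and C")
    return [
        (start, mrna[start:start + length], guide_strand(mrna[start:start + length]))
        for start in range(len(mrna) - length + 1)
    ]


# Hypothetical 36-nucleotide stretch standing in for a coat protein mRNA.
toy_coat_mrna = "AUGGACAAAUCUGAAUCAACCAGUGCUGGUCGUAAC"
candidates = sirna_candidates(toy_coat_mrna)
print(f"{len(candidates)} candidate windows of 21 nucleotides")
for start, window, guide in candidates[:3]:
    print(start, window, guide)
```

Each complementary strand pairs with exactly one stretch of the target, which is what allows an interfering RNA to mark a specific mRNA for degradation while leaving unrelated transcripts alone.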

Often field crops are exposed to several different viruses, any one of
which may damage the plant and lower the final yield. Ideally, transgenic
plants should be resistant to more than one virus. With this in mind, Ti
plasmid binary vectors expressing one or more coat protein genes for CuMV,
zucchini yellow mosaic virus, and watermelon mosaic virus 2 were used to
transform yellow crookneck squash (Cucurbita pepo) plants (Fig. 19.13).
Transgenic plants that contained the coat protein genes from all three viruses
were resistant to damage by all three viruses under laboratory conditions.
Initially, transgenic plants expressing coat protein genes for zucchini yellow
mosaic virus and watermelon mosaic virus 2 were tested under field
conditions by using aphids, which are small insects that naturally transmit
these viruses to developing plants. The transgenic plants that expressed both
coat protein genes were completely resistant to infection when the two viruses
were transmitted at the same time (Fig. 19.14). On the other hand, while
transgenic plants expressing only one of the two viral coat proteins were
delayed in succumbing to the viruses in comparison with the nontransformed
control plants, all of these plants eventually developed severe symptoms
of viral disease, making them unfit for sale to consumers.

FIGURE 19.13 (A) A T-DNA construct with a neomycin phosphotransferase (NPT II)
gene as a selectable marker, a β-glucuronidase (GUS) gene as a reporter gene, two
copies of the coat protein gene from watermelon mosaic virus 2 (WMV 2), and the
coat protein gene from CuMV. The left and right borders of the T-DNA are indicated
by LB and RB, respectively. (B) Similar to panel A without CuMV and GUS, with
one copy of WMV 2, and with the coat protein gene from zucchini yellow mosaic
virus (ZYMV). (C) Same as panel B with the addition of CuMV. All of the genes in
these constructs include both promoters and transcription terminator regions.

FIGURE 19.14 Disease frequencies in transgenic and nontransformed (wild-type)
yellow crookneck squash in the field. Aphids were used to transmit a mixture of
zucchini yellow mosaic virus (ZYMV) and watermelon mosaic virus 2 (WMV) to
the squash plants. Adapted from Fuchs and Gonsalves, Bio/Technology 13:1466-1473,
1995.

More recently, transgenic squash plants that express viral coat protein 
genes for zucchini yellow mosaic virus, watermelon mosaic virus 2, and 
CuMV have been tested in the field. Following the demonstration that they 
effectively protected plants against disease caused by any or all of these 
viruses, these transgenic squash plants were made commercially available. 
The increase in plant yield resulting from virus protection depends upon 
which viruses the plants are challenged with, how great the viral pressure 
is, and the time of the growing season. Despite these many variables, one 
study estimated that transgenic squash with resistance to these three 
viruses, subjected to severe viral pressure, could produce as much as a 
50-fold increase in marketable squash over nontransgenic varieties. Clearly, 
using more than one viral coat protein gene is an effective strategy that 
should be useful in developing a range of transgenic plants that are resistant
to all of the major viruses that normally inhibit their growth and development.
However, it must be borne in mind that in order to satisfy the
variety of consumer tastes, there are a large number of squash varieties, all 
of which are potentially susceptible to these viruses, and with this approach, 
all of them would have to be genetically engineered in a similar manner. In 
addition, those transgenic lines that are resistant to three different viruses 
are still susceptible to papaya ringspot virus type W, so "complete" viral 
protection of summer squash will require the introduction of the viral coat 
protein for this virus, as well. 

The phenomenon of using a plant-encoded viral gene to disrupt the 
virus life cycle and thereby confer resistance to the virus is sometimes 
called homology-dependent gene silencing (formerly called cosuppression).
In homology-dependent gene silencing, the addition of new copies of
a gene to the genome inhibits expression of both the introduced gene and
the previously present endogenous copies or, in the case of viral genes,
those genes that are synthesized after infection. In fact, in some cases, the
plant's own defense mechanisms may include the possibility of
homology-dependent gene silencing.

Perhaps the most successful employment of the viral coat protein 
strategy to prevent damage from a plant virus is the use of the coat protein 
of papaya ringspot virus to protect papaya plants against the virus. In fact, 
these transgenic papaya plants are widely credited with saving the 
Hawaiian papaya industry (Box 19.3). 

Protection by Expression of Other Genes 

RNase III. Engineered resistance to plant viruses—generally as a result of 
expressing a viral coat protein or other viral gene in the transgenic plant— 
is usually an effective strategy only against closely related viruses. Since 
there are a large number of different viruses that could potentially infect a 
crop, it would be advantageous if plants could be engineered to be resistant 
to a broad spectrum of viruses. To do this, a strain of wheat was engineered 
to express the E. coli gene (rnc) for ribonuclease (RNase) III, an enzyme that
cleaves only double-stranded RNA; most plant viruses have double-stranded
RNA as their genetic material. When tested, transgenic plants that
expressed the rnc gene were resistant to several different RNA plant
viruses. Unfortunately, plants that expressed this gene were often stunted 
and did not develop normally. This was probably a result of the interaction 
between the plant RNA and the enzyme. To overcome this problem, a 
mutant of RNase III was used. The mutant enzyme was still able to bind 
stoichiometrically to double-stranded RNA, but it no longer cleaved this 


BOX 19.3 


Saving the Hawaiian 
Papaya Industry 

Papaya (Carica papaya) is an important tropical fruit crop grown in Brazil,
India, Mexico, and Thailand (as well as several other tropical countries)
and in Hawaii. It is valued as a healthy food because it is rich in vitamins
C and A and because it contains large amounts of the proteolytic enzyme
papain, which is potentially an aid to the digestion of proteins. The tree is
relatively easy to grow from seeds, and the first fruit can be harvested a
few months after the seeds are sown. Thereafter, fruit is produced
continuously on a year-round basis.

Papaya ringspot virus (PRSV) is a potyvirus that is transmitted by aphids.
In addition to papaya, the virus also infects a number of cucurbits (e.g.,
squash, watermelon, and cucumber). The viral RNA genome consists of
approximately 10,000 nucleotides, which exist as a single strand. The genome
is monocistronic, so it is expressed as a single large polypeptide that is
subsequently processed into several different functional proteins. When PRSV
was discovered to be present on the Hawaiian island of Oahu in the late
1950s, the papaya industry was moved over the course of several years to the
area of Puna on the island of Hawaii (sometimes called the Big Island).
However, by the 1970s, PRSV was also detected in Puna.

Subsequently, a team of scientists, headed by Dennis Gonsalves, developed
(from the commercial cultivar called Sunset) a line of transgenic papaya
(called 55-1) which expressed the coat protein gene of PRSV. At the time that
this work was undertaken, papaya had not been genetically transformed, and
thus, the coat protein gene was introduced by microprojectile bombardment.
The homozygous (for the coat protein gene) version of this strain (now called
UH SunUP) was shown to be highly resistant to PRSV under field conditions.
The UH SunUP strain was then crossed with the nontransgenic Kapoho strain
(which is the dominant strain of papaya grown in Hawaii) to create a hybrid
strain called UH Rainbow. Importantly, while both the transgenic and
nontransgenic papaya strains could be infected with PRSV, the transgenic
strains have remained resistant to the virus for up to (at least) 3 years.

The main markets for Hawaiian papaya are the mainland United States, Canada,
and Japan. These transgenic plants have been approved for use in the United
States and Canada, and it is expected that they will receive Japanese
regulatory approval shortly. The success with papaya demonstrates that the
approach of using coat protein-mediated protection may be both a safe and an
efficacious way to develop protection from virus for a range of crops.






substrate (Fig. 19.15). The mutant gene (rnc70) was introduced into wheat,
under the control of a corn ubiquitin gene promoter (Fig. 19.16), by
microprojectile bombardment. Transgenic plants that expressed mutant RNase
III developed normally and exhibited a high level of resistance to infection
by barley stripe mosaic virus. In this instance, binding of the mutant RNase
III to replicating barley stripe mosaic virus prevented viral replication. In
addition to being useful with RNA viruses, this approach should be an
effective strategy for eliminating viroid infection of plants. Viroids are
disease-causing agents with a circular single-stranded RNA genome that
contains double-stranded regions formed by intrastrand base pairing. Plant
viroids are difficult to control because they do not encode any proteins; 
therefore, the viroid nucleic acid must be targeted. 

Pokeweed antiviral protein. In addition to "immunizing" plants against
damage from viruses by expressing viral proteins in the plant cells,
protection can be conferred by antiviral plant proteins. For example, pokeweed
(Phytolacca americana) has three antiviral proteins in its cell wall: pokeweed
antiviral protein (PAP), which is found in spring leaves; PAPII, which is
found in summer leaves; and PAP-S, which appears in seeds. Although PAP and
PAPII are only 40% identical at the protein level, and antibodies directed
against PAP do not react with PAPII, they employ similar modes of action.
Both PAP and PAPII are ribosome-inactivating proteins that remove a specific
adenine residue from the large ribosomal RNA of the 60S subunit of
eukaryotic ribosomes. When pokeweed plants are infected with viruses,
either PAP or PAPII is synthesized, depending on the season, and the
ribosomes in the infected cells are inactivated. Based on their mode of action,
PAP and PAPII are good candidates for developing transgenic plants that 
are resistant to a broad spectrum of plant viruses. 

After a cDNA encoding PAP was isolated, it was introduced, under the 
transcriptional control of the 35S promoter, into tobacco and potato plants 
with binary Ti plasmid vectors. Transformants that expressed a high level 
of PAP (>10 ng/mg of protein) had a stunted and mottled appearance and 
were sterile. On the other hand, plants with a lower level of PAP (1 to 5 ng/ 
mg of protein) were normal in appearance and fertile. Thus, above a certain 
level, PAP interferes with normal cellular functioning. In transgenic plants 


FIGURE 19.15 Binding of the native form (A) and the mutant form (B) of E. coli RNase
III to double-stranded RNA. The native form of the enzyme cleaves the RNA, while
the mutant form does not.

FIGURE 19.16 A portion of the genetic construct used to transform wheat plants with
the gene for the mutant form of E. coli RNase III (rnc70). The construct includes the
corn ubiquitin gene (Ubi) promoter, the first exon and intron in front of a mutant
form of E. coli RNase III, and a transcription terminator (TT) region from a nopaline
synthase gene.


expressing PAP, the major effect of the antiviral protein was to reduce the 
number of lesions in virus-infected plants. When transgenic tobacco and 
potato plants expressing low levels of PAP were challenged with either 
potato virus X or Y, they developed significantly fewer lesions on their 
leaves than nontransformed control plants.

Transgenic plants that contained the cDNA for PAPII expressed the 
protein at a much higher level than that observed for PAP (up to 250 ng/ 
mg of protein). Plants with >150 ng of PAPII per mg of protein had chlorotic
lesions, while those with 10 to 100 ng of PAPII per mg of protein were
normal. Transgenic plants that expressed the lower level of PAPII and were
otherwise normal were resistant to tobacco mosaic virus, potato virus X,
and the fungal pathogen Rhizoctonia solani. While this gene is highly
effective in the laboratory, it remains to be seen how it will function under field
conditions. 

Single-chain antibodies. One way to protect plants against viral infection 
is to engineer the plants to produce antibodies that are directed against the 
invading viruses. This was done, with some success, by expressing single-chain
Fv antibodies directed against tobacco mosaic virus in tobacco plants.
However, as a consequence of the variability between coat proteins from
different viruses, this strategy is not useful for providing broad-range
resistance against several different viruses.

The majority of plant viruses are RNA viruses, and many of them contain
positive-stranded RNA as the genetic material. These viruses all encode
RNA-dependent RNA polymerases that are essential for their replication. 
Thus, a single-chain Fv antibody that recognizes epitopes that are common 
to the RNA-dependent RNA polymerases from several different viruses 
should be an effective means of inhibiting the replication of all of these 
viruses, thereby making transgenic plants that express these single-chain Fv 
antibodies resistant to these viruses (Fig. 19.17). That is because the antibody 
fragment can bind to RNA polymerases and thereby block their activities. In 
addition, since even in virus-infected cells RNA-dependent RNA polymerases
are found in only low concentrations, a high level of antibody
expression is not required. When two phage display libraries were screened
against purified denatured fragments of the RNA-dependent RNA polymerase
of tomato bushy stunt virus, the three single-chain Fv antibodies
that displayed the highest affinity were isolated and characterized. A cDNA
encoding one of these single-chain Fv antibodies was expressed in Nicotiana
benthamiana (a close relative of tobacco, native to Australia). The resultant
transgenic plants were significantly protected against tomato bushy stunt
virus and cucumber necrosis virus and partially resistant to turnip crinkle
virus and red clover necrotic virus. In addition, the single-chain Fv antibody
directed against tomato bushy stunt virus bound to the RNA-dependent
RNA polymerase of hepatitis C virus, which is much more distantly related.

FIGURE 19.17 Schematic representation of the inhibition of tomato bushy stunt virus
RNA-dependent RNA polymerase by a single-chain Fv antibody. Tomato bushy
stunt virus has a genome of 4,775 nucleotides that encodes five separate proteins,
one of which is the viral RNA-dependent RNA polymerase.


This work represents an important first step in using a simple and
straightforward strategy to develop transgenic plants that are resistant to a wide
range of different viruses. 

Micro-RNAs. One approach to developing plants that are resistant to a 
range of different viruses might include engineering the plants to produce 
micro-RNAs (miRNAs) that interfere with viral replication by targeting the 
viral RNA (or the viral mRNA) for degradation. In a recent series of
experiments, starting with a 273-nucleotide precursor of a naturally occurring
plant miRNA, scientists used PCR to replace a small portion of the existing
sequence so that the precursor could be processed to yield a 20- to
24-nucleotide-long miRNA that was complementary to viral RNA (Fig. 19.18). The
newly synthesized artificial miRNA (amiRNA) became part of an RNA- 
induced silencing complex (see chapter 11) in which the viral RNA (mRNA) 
was specifically bound and cleaved. It is also possible to clone two or more 
different pre-amiRNAs in tandem—this was done using turnip yellow 
mosaic virus and turnip mosaic virus—so that plants transformed with this 
construct become resistant to two separate viruses. This approach can be 
made even more effective by targeting more than one portion of each viral 
RNA using several pre-amiRNAs. Despite its intriguing possibilities, this 
system still requires a considerable amount of development before it is 
shown to be effective under field conditions. 
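
The central design step, choosing an miRNA that is complementary to a stretch of viral RNA, can be illustrated with a short sketch. The following Python example is only an illustration under simplifying assumptions: the viral sequence is invented, the function names are hypothetical, and perfect complementarity over a 21-nucleotide window is assumed, whereas a real design would also have to respect the structure of the precursor and the processing rules of the plant.

```python
# Sketch: derive an artificial miRNA (amiRNA) sequence that is fully
# complementary to a chosen window of a viral RNA.
# All names and the example sequence are hypothetical illustrations.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_amirna(viral_rna: str, start: int, length: int = 21) -> str:
    """Return an RNA of `length` nucleotides complementary to
    viral_rna[start:start + length]."""
    if not 20 <= length <= 24:
        # The text describes processed miRNAs of 20 to 24 nucleotides.
        raise ValueError("length must be between 20 and 24 nucleotides")
    target = viral_rna[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the viral RNA")
    return reverse_complement_rna(target)


# Invented sequence standing in for part of a viral genome.
viral_rna = "AUGGCUAGCUAGGAUCCGAUUACGGAUCCAUGCAAGCUU"
amirna = design_amirna(viral_rna, start=5, length=21)
print(amirna)
```

Taking the reverse complement of the amiRNA recovers the targeted window, which is a convenient check that the two strands would pair along their full length.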


Herbicide Resistance 

A significant fraction of global crop production is lost through weed
infestation every year, despite the expenditure of $10 billion on more than 100
different chemical herbicides. In addition, many herbicides do not
discriminate weeds from crop plants; others must be applied early, before the weeds
take hold; and some persist in the environment. The creation of
herbicide-resistant crop plants is one way to overcome some of these drawbacks.

FIGURE 19.18 Use of amiRNAs to provide protection against viruses. The
273-nucleotide-long precursor (labeled pre-miR159a) of a naturally occurring miRNA, which
is 20 to 24 nucleotides long, is cloned, and then PCR is used to alter a portion of the
pre-miRNA sequence. The modified amiRNA sequence is cloned into a binary
vector under the control of the 35S promoter, which is then used to transform
Arabidopsis plants that now produce a pre-amiRNA that, when it is processed,
targets a specific viral RNA for cleavage. Adapted from Niu et al., Nat. Biotechnol.
24:1420-1428, 2006.

A number of different biological manipulations that would cause a 
crop plant to be herbicide resistant can be envisioned. 

1. Inhibit uptake of the herbicide. 

2. Overproduce the herbicide-sensitive target protein so that enough 
of it remains available for cellular functions despite the presence of 
the herbicide. 

3. Introduce a bacterial or fungal gene that produces a protein that is 
not sensitive to the herbicide but performs the same function as the 
plant (herbicide-sensitive) protein. 

4. Reduce the ability of a herbicide-sensitive target protein to bind to 
a herbicide. 

5. Endow plants with the capability to metabolically inactivate the 
herbicide. 

TABLE 19.3 Some examples of gene-based herbicide resistance

Herbicide(s)                    Mode of development of herbicide resistance

Triazines                       Resistance is due to an alteration in the psbA
                                gene, which codes for the target of this
                                herbicide, chloroplast protein D-1.

Sulfonylureas                   Genes encoding resistant versions of the enzyme
                                acetolactate synthetase have been introduced
                                into poplar, canola, flax, and rice.

Imidazolinones                  Strains with resistant versions of the enzyme
                                acetolactate synthetase have been selected in
                                tissue culture.

Aryloxyphenoxypropionates,      These herbicides inhibit the enzyme acetyl
cyclohexanediones               coenzyme A carboxylase. Resistance, selected in
                                tissue culture, is due either to an altered
                                enzyme that is not herbicide sensitive or to
                                the degradation of the herbicide.

Glyphosate                      Resistance is from overproduction of EPSPS, the
                                target of this herbicide. Resistance has been
                                engineered by transforming soybean with the
                                gene for a glyphosate-resistant EPSPS and
                                tobacco with a glyphosate oxidoreductase
                                gene, which encodes an enzyme that degrades
                                glyphosate.

Bromoxynil                      Resistance to this photosystem II inhibitor has
                                been created by transforming tobacco and
                                cotton plants with a bacterial nitrilase gene,
                                which encodes an enzyme that degrades this
                                herbicide.

Phenoxycarboxylic acids         Resistant cotton and tobacco plants have been
(e.g., 2,4-D and 2,4,5-T)       created by transformation with the tfdA gene
                                from Alcaligenes, which encodes a dioxygenase
                                that degrades this herbicide.

Glufosinate                     Over 20 different plants have been transformed
(phosphinothricin)              with either the bar gene from Streptomyces
                                hygroscopicus or the pat gene from
                                S. viridochromogenes. The phosphinothricin
                                acetyltransferase that these genes encode
                                detoxifies this herbicide.

Cyanamide                       Resistant tobacco plants were produced when a
                                cyanamide hydratase gene from the fungus
                                Myrothecium verrucaria was introduced. The
                                enzyme encoded by this gene converts
                                cyanamide to urea.

Dalapon                         Tobacco plants transformed with a dehalogenase
                                gene from Pseudomonas putida can detoxify
                                this herbicide.

A number of these strategies have been implemented to produce
herbicide-resistant transgenic plants (Table 19.3). This approach has been so successful
that more than 75% of the transgenic crops that are currently planted
worldwide have been engineered to be herbicide resistant. By far, the most widely

used herbicide is glyphosate, which is considered to be safe, cheap,
effective, and "environmentally friendly" because it is readily degraded to
nontoxic compounds in the soil. Glyphosate, trademarked as Roundup by the
Monsanto Corporation, inhibits a key enzyme in the shikimate pathway,
5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), that plays an important
role in the synthesis of aromatic amino acids in both bacteria and
plants. Plants resistant to this herbicide have been developed by putting an
EPSPS-encoding gene from a glyphosate-resistant strain of E. coli under the
control of plant promoter and transcription termination-polyadenylation
sequences and cloning the construct into plant cells. Transgenic soybean,
corn, canola, tobacco, petunia, tomato, potato, and cotton plants that
produce an amount of the resistant E. coli EPSPS sufficient to replace the
inhibited plant enzyme are resistant to the effects of glyphosate. Thus, in these
cases, the crop plant would not be affected by glyphosate treatment, whereas
the weeds would be. Crops that have been engineered to be resistant to
glyphosate by this approach are said to be "Roundup ready."

Notwithstanding the many years of successful use of glyphosate and 
Roundup-ready plants, two important factors are now changing people's 
thinking about this approach. In the first instance, the herbicide patent has 
now expired and other companies are very actively pursuing the development
of plants that are resistant to glyphosate using other approaches.
Secondly, there is a realization that worldwide agriculture has become too 
dependent upon a single herbicide and that alternative strategies need to 
be developed. 

To find an enzyme that can inactivate glyphosate, one group of 
researchers assayed a collection of several hundred Bacillus sp. strains for 
the ability to acetylate glyphosate (Fig. 19.19). The assay was based on the 
ability to measure N-acetylglyphosate in the supernatant of permeabilized 
cells (Fig. 19.20). The three strains (all Bacillus licheniformis) that had the 
highest level of glyphosate N-acetyltransferase activity were isolated, and

FIGURE 19.19 N-acetylation of the herbicide glyphosate by the bacterial enzyme
glyphosate N-acetyltransferase. CoASH, coenzyme A.


FIGURE 19.20 Overview of a scheme to isolate a bacterial enzyme with a sufficient 
level of glyphosate N-acetyltransferase activity to allow it to be used to engineer 
plants so that they are resistant to high levels of glyphosate. 





























the enzymes were characterized. In all cases, these strains exhibited only a 
very low level of enzyme activity. Subsequently, the genes from each of the 
three selected strains encoding glyphosate N-acetyltransferase activity 
were isolated. These genes were then shuffled (see chapter 8) numerous 
times, each time selecting the strain with glyphosate N-acetyltransferase 
with the highest level of activity. In fact, following 11 iterations of DNA 
shuffling, the enzyme catalytic efficiency (see chapter 8) improved by 
nearly 10,000-fold. Interestingly, while the modified enzyme still functions 
as a glyphosate N-acetyltransferase, after so many rounds of modification, 
the amino acid sequence of the modified enzyme is only 76 to 79% identical 
to the amino acid sequences of the parental enzymes. Finally, the modified 
glyphosate N-acetyltransferase gene was introduced into Arabidopsis, 
tobacco, and corn plants. The transgenic plants, which expressed the 
enzyme in the plant cytosol and were both morphologically normal and 
fertile, were tolerant of approximately six times the dose of glyphosate that 
killed the parental nontransformed plants. This work is an important first
step in developing plants that can act as an alternative to Roundup-ready 
plants. However, the efficacy of this approach remains to be proven in the 
field. 
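
To put the shuffling result in perspective, the overall gain can be converted to an average gain per round. The sketch below assumes, purely for illustration, that the improvement was spread evenly over the 11 iterations, which the source does not state; the function name is hypothetical. Under that assumption, a 10,000-fold overall improvement corresponds to roughly a 2.3-fold gain in catalytic efficiency per round.

```python
# Sketch: average fold improvement per round of DNA shuffling, assuming
# (hypothetically) that every round contributed the same multiplicative gain.

def average_fold_per_round(total_fold: float, rounds: int) -> float:
    """Geometric-mean fold improvement per round."""
    if total_fold <= 0 or rounds <= 0:
        raise ValueError("total_fold and rounds must be positive")
    return total_fold ** (1.0 / rounds)


total_fold = 10_000  # "nearly 10,000-fold", as reported in the text
rounds = 11          # iterations of DNA shuffling reported in the text

per_round = average_fold_per_round(total_fold, rounds)
print(f"average improvement per round: {per_round:.2f}-fold")
```

The point of the calculation is that a modest improvement at each cycle, compounded over many cycles of shuffling and selection, is enough to account for a very large final change.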

The herbicide dicamba has been used since the 1960s to control a wide 
range of broadleaf weeds. When it is applied to dicotyledonous plants, 
dicamba acts by mimicking the effects of high levels of the plant hormone 
indole-3-acetic acid and binding to indole-3-acetic acid receptors, which are 
essential for normal growth and development of the plant. The herbicide is 
widely used, relatively inexpensive, and environmentally friendly in that it 
does not persist in soils and has no toxicity to humans or other animals. 
Moreover, the widespread use of the herbicide has not led to the development
of any dicamba-resistant weeds. Researchers have therefore sought to
develop crop plants that are resistant to dicamba. To do this, a dicamba 
monooxygenase gene was expressed in Arabidopsis, tomato, and tobacco 


FIGURE 19.21 (A) Conversion of dicamba to 3,6-dichlorosalicylic acid by dicamba
monooxygenase. (B) The genetic construct used to express the dicamba
monooxygenase gene within the chloroplasts of transgenic plants. The promoter was from
peanut chlorotic streak virus, the enhancer was from tomato etch virus, the transit
peptide was from the small subunit of pea ribulose 1,5-bisphosphate carboxylase,
the dicamba monooxygenase gene was from the soil bacterium P. maltophilia, and the
terminator sequence was from the small subunit of pea ribulose 1,5-bisphosphate
carboxylase. Adapted from Behrens et al., Science 316:1185-1188, 2007.





plants (all as test systems). The dicamba monooxygenase is part of the 
three-component enzyme dicamba O-demethylase, from the bacterium 
Pseudomonas maltophilia, that converts dicamba to 3,6-dichlorosalicylic acid, 
a compound without any appreciable herbicidal activity (Fig. 19.21). In 
transgenic plants, only dicamba monooxygenase is needed for the inactivation
of the herbicide, since the enzyme can be targeted to the chloroplast,
where there is a ready source of reduced ferredoxin (the product of the 
other two genes in the dicamba O-demethylase complex). The reduced 
ferredoxin supplies electrons for the monooxygenase reaction. As expected, 
transgenic plants expressing dicamba monooxygenase are resistant to high 
levels of the herbicide when grown in both the greenhouse and the field. It 
is speculated that it may be possible to "stack" plants with genes encoding 
both glyphosate and dicamba resistance so that farmers can either alternate 
the use of the two herbicides or else apply them at the same time. In this 
way, it is anticipated that unwanted weeds are unlikely to develop resistance
to both herbicides, while the transgenic crop plant is uninhibited by
the herbicide. 
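
The reasoning behind stacking can be made concrete with a small calculation. The sketch below rests on assumptions that are not in the source: resistance to each herbicide is treated as arising independently, and the two frequencies are invented placeholders rather than measured values. Under independence, the chance that a single weed carries both resistances is the product of the individual frequencies.

```python
# Sketch: why resistance to two stacked herbicides is expected to be rarer
# than resistance to either one alone.
# Assumption (not from the source): the two resistances arise independently.
# The frequencies below are hypothetical placeholders, not measured data.

def joint_resistance_frequency(freq_a: float, freq_b: float) -> float:
    """Frequency of individuals resistant to both herbicides, assuming
    the two resistance traits are independent."""
    for freq in (freq_a, freq_b):
        if not 0.0 <= freq <= 1.0:
            raise ValueError("frequencies must lie between 0 and 1")
    return freq_a * freq_b


freq_glyphosate = 1e-6  # hypothetical frequency of glyphosate-resistant weeds
freq_dicamba = 1e-6     # hypothetical frequency of dicamba-resistant weeds

both = joint_resistance_frequency(freq_glyphosate, freq_dicamba)
print(f"expected frequency of doubly resistant weeds: {both:.1e}")
```

If a single mechanism could confer resistance to both herbicides, the independence assumption would fail and the benefit of stacking would be smaller than this estimate suggests.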

Resistance due to inactivation of bromoxynil
(3,5-dibromo-4-hydroxybenzonitrile), a herbicide that acts by inhibiting photosynthesis, has been
achieved for some plants. In this case, resistant plants were created by the 
introduction of a bacterial gene that encodes the enzyme nitrilase, which 
can inactivate bromoxynil before the herbicide can act (Fig. 19.22). The gene 
for nitrilase was isolated from the soil bacterium Klebsiella ozaenae and 
placed under the control of the light-regulated promoter from the small 
subunit of the enzyme ribulose bisphosphate carboxylase before it was 
transferred to tobacco plants. As expected, the transgenic plants expressed 
nitrilase activity in their shoots and leaves, but not in their roots, and were 
resistant to the toxic effects of the herbicide. 


Fungus and Bacterium Resistance 

Extensive damage and loss of crop productivity are caused by phytopathogenic
fungi. It has been estimated that one fungal disease of one major crop,
i.e., fungal rice blast, a disease that affects rice plants, costs farmers in 
Southeast Asia, Japan, and the Philippines more than $5 billion per year. At 
present, the major way of controlling the damage and losses to crop plants 
that result from fungal infection is through the use of chemical agents that 
may persist and accumulate in the environment and that are subsequently 
hazardous to animals or humans. It would therefore be beneficial if a 


FIGURE 19.22 Detoxification of the herbicide bromoxynil by the enzyme nitrilase
from K. ozaenae.

plant by a pathogenic fungus or bacterium (shown in blue), the inactive storage
compound salicylic acid 2-O-β-D-glucoside is converted to salicylic acid and/or
salicylic acid is synthesized. The salicylic acid activates or induces the NPR1 gene,
whose protein product acts as a "master" regulatory protein to turn on the
expression of the PR proteins, which have enzyme activities directed against various
pathogenic organisms.


simple, inexpensive, effective, and environmentally friendly nonchemical 
means of preventing fungal damage to crop plants could be found. 

Plants often respond to fungal or bacterial pathogen invasion or other 
environmental stresses by converting a conjugated storage form of salicylic
acid (salicylic acid 2-O-β-D-glucoside) to salicylic acid, which induces a
broad systemic defense response in the plant. This "systemic acquired
resistance" to pathogens extends to plant tissues that are far from the site
of the initial infection and may last for weeks to months. It results from the
synthesis of a group of proteins called pathogenesis-related (PR) proteins
(Fig. 19.23). The PR proteins include β-1,3-glucanases, chitinases,
thaumatin-like proteins (thaumatin is a small, very sweet protein), and protease
inhibitors that protect the plant from invading pathogens. To develop plants
resistant to fungal pathogens, researchers have attempted to utilize parts of 
the systemic acquired resistance system. For example, transgenic plants 
that constitutively express high levels of one or more PR proteins, such as 
chitinase, which can hydrolyze the β-1,4 linkages of the
N-acetyl-D-glucosamine polymer chitin, a major component of many fungal cell walls,
have been engineered (Fig. 19.24).

FIGURE 19.24 Plasmid vector containing a rice chitinase gene cassette and a
hygromycin resistance gene cassette (in both cases including transcriptional regulatory
sequences) used to transform rice protoplasts. Rice cell protoplasts were
transformed by polyethylene glycol treatment in the presence of this plasmid. Transformed
cells were selected for their resistance to hygromycin. Later, they were tested for the
presence of chitinase genes by Southern hybridization and for chitinase by Western
blot analysis; then, they were used to regenerate transgenic plants.

The NPR1 gene from the plant A. thaliana encodes a "master" regulatory
protein that controls the expression of the PR proteins, and it can be
activated or induced by the addition of salicylic acid. In A. thaliana,
overexpression of the NPR1 gene can lead to the generation of broad-spectrum
disease resistance against both fungal and bacterial pathogens. Moreover, 
scientists have observed that overproduction of this "master switch" is an 
effective strategy in several plants other than A. thaliana, including rice, 
sugar beet, apple, and corn. 

Another approach to engineering plants with broad-spectrum disease 
resistance involves overproducing salicylic acid. Theoretically, this can be 
done by transforming plants with bacterial genes that encode the enzymes 
isochorismate synthase and isochorismate pyruvate lyase, which catalyze 
salicylate synthesis (Fig. 19.25). Salicylate is synthesized from chorismate,
which is produced in large amounts in the chloroplast and is also an
intermediate in the biosynthesis of the amino acid tryptophan. The two bacterial
genes for salicylate synthesis were fused to chloroplast-targeting sequences
from the gene for the small subunit of ribulose bisphosphate carboxylase;
the small subunit of ribulose bisphosphate carboxylase is encoded within
the nuclear DNA, but following its synthesis, this protein is transported to
the chloroplast (Fig. 19.26). The result of this genetic manipulation was that
when both of these enzymes were localized in the (tobacco) plant
chloroplast, salicylic acid was produced constitutively. Consequently, the plants
constitutively expressed a number of PR proteins. The plants appeared
normal but exhibited enhanced resistance to both viral and fungal
pathogens. Since it is not necessarily advantageous to the plant to constitutively
express PR proteins, there is some question as to how effective this strategy 
of conferring protection against a broad range of pathogens will be in the 
field. 

Transgenic plants that have been engineered to constitutively express 
chitinase under the control of the cauliflower mosaic virus 35S promoter 
include rice, tobacco, and canola. Transgenic plants that expressed
chitinase were more resistant to damage by fungal pathogens than control
plants, even though the control plants synthesized their own PR proteins
in response to the fungal infection. Presumably, this resistance reflects the
higher level of chitinase expressed by the transgenic plants than by
nontransformed plants. In addition, while transgenic plants constitutively

FIGURE 19.25 Use of bacterial enzymes to convert plant chloroplast chorismate to
salicylate.

FIGURE 19.26 Construct used to transform plants so that they constitutively
overproduce salicylate. P35S, the 35S promoter from cauliflower mosaic virus; CTS,
chloroplast targeting sequence; ICS, isochorismate synthase; TT, transcription termination
region; IPL, isochorismate pyruvate lyase.

expressing chitinase were resistant to fungal pathogens, binding of the 
beneficial root fungus Glomus mosseae to the plant roots was not affected. 
This was probably a consequence of a difference in the cell wall
compositions of different fungi. Importantly, a transgenic plant constitutively
expressing chitinase has been found to be effective at resisting fungal
damage under field conditions. In a variation on the strategy described
above, a cDNA encoding chitinase from the biocontrol fungus Trichoderma
harzianum was isolated and introduced, under the control of the 35S
promoter, into tobacco and potato plants. As expected, these transgenic plants
were resistant to both soil-borne fungal pathogens (primarily affecting 
roots) and foliar fungal pathogens (primarily affecting shoots and leaves). 
In sum, the overexpression of some PR-like proteins, such as chitinase, 
appears to be an effective strategy for protecting plants against damage 
from pathogenic fungi. 

Pathogenic fungi belonging to the genus Fusarium are the causative 
agents of some of the most costly and devastating plant diseases in the 
world. Therefore, a strategy that targeted various Fusarium spp. would be 
quite important for agriculture worldwide. A number of different
antimicrobial peptides (which can disrupt the cell membrane) and the enzyme
chitinase (mentioned above) are inhibitory to the growth of Fusarium spp.
However, these biological approaches (regardless of how they are
administered) are not as effective as spraying plants with chemical fungicides. To
overcome this limitation, workers have fused the genes (cDNAs) for two 
different antimicrobial peptides (one from the radish Raphanus sativus and 
one from the mold Aspergillus giganteus) and a chitinase (from wheat) to a 
single-chain Fv antibody that binds to a Fusarium cell wall protein (Fig. 
19.27). Although the single-chain Fv antibody was originally selected from 
a library constructed from chickens that had been immunized with 

FIGURE 19.27 Schematic representation of three different anti-fungal fusion proteins. 
Each protein includes a single-chain Fv antibody that binds to a Fusarium sp. cell 
wall protein and an antifungal peptide/protein directed against either the cell 
membrane or the chitin component of the cell wall. Adapted from Bohlmann, Nat. 
Biotechnol. 22:682-683, 2004.

Fusarium graminearum, the selected antibody cross-reacted with cell wall
antigens from nine different species and subspecies of Fusarium. Constructs 
encoding the three fusion proteins were used to transform A. thaliana, and 
the transgenic plants were tested for resistance to infection and growth 
inhibition by Fusarium oxysporum. When either the selected single-chain Fv 
antibody or any one of the three antipathogenic peptides/proteins was 
expressed in transgenic plants, it endowed the plants with a low to
moderate level of resistance to damage by the pathogen (Table 19.4). However,
all three of the antibody-peptide/protein fusions conferred a high level of 
resistance to the pathogen, suggesting, in each case, that the two
components of the fusion protein were acting synergistically. This is an interesting
and potentially quite useful approach that merits further development. 
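
One way to see why the fusion results in Table 19.4 point to synergy is to compare each fusion with what its two parts might achieve if they acted independently. The sketch below is only an illustration: the multiplicative independence model (often called Bliss independence) is an assumption brought in here, not the analysis used in the original study, and the function names are hypothetical. The disease index values are those listed in Table 19.4.

```python
# Disease index values (%) as listed in Table 19.4.
DISEASE_INDEX = {
    "wild type": 100,
    "Fv antibody": 55,
    "peptide 1": 50,
    "peptide 2": 60,
    "chitinase": 52,
    "antibody-peptide 1 fusion": 10,
    "antibody-peptide 2 fusion": 0,
    "antibody-chitinase fusion": 5,
}


def expected_if_independent(index_a, index_b):
    """Disease index (%) expected if two components act independently.

    Assumption (not from the source): each component leaves a fraction
    index/100 of the disease, and the two fractions multiply.
    """
    return 100.0 * (index_a / 100.0) * (index_b / 100.0)


def compare_fusions(table=DISEASE_INDEX):
    """Return {fusion name: (expected index, observed index)}."""
    antibody = table["Fv antibody"]
    results = {}
    for partner in ("peptide 1", "peptide 2", "chitinase"):
        fusion = f"antibody-{partner} fusion"
        expected = expected_if_independent(antibody, table[partner])
        results[fusion] = (expected, table[fusion])
    return results


if __name__ == "__main__":
    for fusion, (expected, observed) in compare_fusions().items():
        print(f"{fusion}: expected {expected:.1f}%, observed {observed}%")
```

Under this assumed model, independent action of the antibody and peptide 1 would leave a disease index of about 27.5%, whereas the fusion gave 10%. The other two fusions also fall well below their expected values, which is consistent with the synergy suggested in the text.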

The annual worldwide losses to farmers from potato diseases caused 
by the pathogenic soil bacterium Erwinia carotovora are approximately $100 
million. Moreover, potato breeders have not identified any resistance traits 
that can be bred into commercial cultivars. To address this problem,
transgenic potato plants that actively express bacteriophage T4 lysozyme were
developed. The lysozyme was targeted for secretion into the apoplast (the 
intercellular spaces inside the plant but outside the plant cells) in potato 
plants, since this is the part of the plant where E. carotovora enters and
spreads. More specifically, the T4 lysozyme gene was fused to the barley
α-amylase signal peptide coding sequence and placed under the
transcriptional control of the cauliflower mosaic virus 35S promoter, transcription
terminator, and polyadenylation site. Although the T4 lysozyme gene was 
under the control of this strong promoter, only a very low level of lysozyme 
was synthesized, perhaps reflecting differences in codon usage between a 
bacteriophage gene and the potato genome. This result notwithstanding, 
under laboratory and greenhouse conditions, transgenic plants with this 
construct were significantly protected from damage by high levels of E. 
carotovora. Since much lower levels of the pathogen than were used in these 
laboratory experiments are present in the field, this type of genetic
construct should provide a high level of protection under natural conditions.
To avoid killing plant-beneficial bacteria in the vicinity of the roots, 
researchers have employed hen egg lysozyme instead of T4 lysozyme 
because it is more specific for various phytopathogenic Erwinia spp. 
Moreover, researchers have found this strategy to be useful for protecting
many different plants, including potato, rice, tomato, and tobacco, from
various bacterial pathogens. Finally, in addressing the concern that
proteins that are present in the apoplast of root cells may be exuded from the
roots (see chapter 18) and kill plant growth-promoting bacteria, as well as
pathogenic bacteria, researchers have noted that the endogenous
rhizosphere microbial community (containing many beneficial bacteria) was
essentially unchanged when a lysozyme transgene was expressed. Despite
this apparent success, it remains to be determined whether this sort of
genetic manipulation, which functions well in the laboratory, will be useful
in the field.

TABLE 19.4 Resistance of transgenic A. thaliana to the phytopathogen F. oxysporum

Transgene                      Disease index (%)
None (wild type)               100
Fv antibody                    55
Peptide 1                      50
Peptide 2                      60
Chitinase                      52
Antibody-peptide 1 fusion      10
Antibody-peptide 2 fusion      0
Antibody-chitinase fusion      5

Adapted from Peschen et al., Nat. Biotechnol. 22:732-738, 2004.
The extent of disease was assessed 2 weeks after infection with
F. oxysporum. A disease index of 100% indicates that all of the plants
are dead, 50% indicates that the average plant has disease symptoms
but is alive, and 0% indicates that all plants are disease free.


Oxidative Stress 

Unlike many animals, plants cannot physically avoid adverse
environmental conditions, such as high levels of light, ultraviolet (UV) irradiation,
heat, high salt concentrations, or drought, so physiological strategies have
evolved to cope with these stresses. At the molecular level, one of the
undesirable consequences of physiological stress is the production of oxygen
radicals. Thus, investigators reasoned that if they could create plants that 
were able to tolerate increased levels of oxygen radicals, these plants 
should also be able to withstand various forms of environmental stress. 

A variety of abiotic stresses, including salt, freezing, and drought, as
well as exposure to pollutants, stimulate the formation of reactive oxygen
species in plant cells. These toxic molecules damage membranes,
membrane-bound structures, and macromolecules, including proteins and
nucleic acids, especially in the mitochondria and chloroplasts, resulting in
oxidative stress. A common type of potentially damaging oxygen radical is
the superoxide anion. Within a cell under oxidative stress, the enzyme
superoxide dismutase detoxifies superoxide anion by converting it to
hydrogen peroxide, which in turn is broken down to water by various
cellular peroxidases or catalases (Fig. 19.28). In one study, tobacco plants that
were transformed with a superoxide dismutase gene that was under the 
control of the 35S promoter from cauliflower mosaic virus had reduced 
oxygen radical damage under stress conditions compared with control 
plants. 
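
The two detoxification steps summarized in Fig. 19.28 can be written as balanced reactions. These are the standard textbook stoichiometries rather than equations given in the text; catalase is shown for the second step, and peroxidases, which use a reducing cosubstrate instead, are omitted for simplicity. The block assumes the amsmath package.

```latex
\begin{align*}
2\,\mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+}
  &\xrightarrow{\text{superoxide dismutase}} \mathrm{H_2O_2} + \mathrm{O_2} \\
2\,\mathrm{H_2O_2}
  &\xrightarrow{\text{catalase}} 2\,\mathrm{H_2O} + \mathrm{O_2}
\end{align*}
```

Written this way, it is clear that superoxide dismutase alone only moves the problem to hydrogen peroxide, so the second step is needed to complete the detoxification.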

Plants have several different isoforms of the enzyme superoxide
dismutase. The Cu/Zn superoxide dismutases are found primarily in
chloroplasts and to a lesser extent in the cytosol. The Mn superoxide dismutase is
located in the mitochondria, and some plants also have an Fe form of 
superoxide dismutase. Transgenic tobacco plants that carried the cDNA for 
a chloroplast-localized Cu/Zn superoxide dismutase under the control of 
the 35S promoter from cauliflower mosaic virus were much more resistant 
to high-light damage than nontransformed plants. When they were tested, 
the transgenic plants retained 94% of their photosynthetic activity under 


FIGURE 19.28 Conversion of superoxide anion to hydrogen peroxide and then to
water and oxygen.

FIGURE 19.29 Oxidation of glutathione to oxidized glutathione and simultaneous
reduction of an organic peroxide to an organic alcohol, catalyzed by glutathione
peroxidase. γ-Glu, glutamic acid residue linked to the next amino acid through the
gamma carboxyl group rather than through the alpha carboxyl group, as in a usual
peptide linkage; Cys, cysteine; Gly, glycine.


conditions in which nontransformed plants lost all of their activity. In 
another experiment, transgenic plants with cloned Mn superoxide
dismutase targeted to their chloroplasts were three- to fourfold less sensitive
to oxidative damage caused by ozone than nontransformed plants. 

Oxidative stress may also be reduced if the level of oxidized
glutathione within a plant is increased. Glutathione peroxidase catalyzes the
conversion of glutathione to oxidized glutathione by reacting with organic 
peroxides and reducing them to organic alcohols (Fig. 19.29). To test this 
idea, a tobacco cDNA encoding an enzyme with both glutathione 
S-transferase and glutathione peroxidase activities was isolated. Transgenic 
tobacco plants that expressed glutathione peroxidase were created using 
the isolated cDNA under the control of the 35S promoter, and the construct 
was introduced into plants with a binary Ti plasmid system. The
transformed plants had approximately twice the level of enzyme activity found
in nontransformed plants. Seedlings of these transgenic plants grew
significantly faster than control seedlings when exposed to either chilling or
salt stress. The efficacy of this system remains to be demonstrated in the 
field. 
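
The glutathione peroxidase reaction described above (Fig. 19.29) can be written as a balanced equation. The abbreviations are conventional ones and are not defined in the text: GSH is reduced glutathione, GSSG is oxidized glutathione, ROOH is an organic peroxide, and ROH is the corresponding alcohol.

```latex
\[
2\,\mathrm{GSH} + \mathrm{ROOH}
  \xrightarrow{\text{glutathione peroxidase}}
  \mathrm{GSSG} + \mathrm{ROH} + \mathrm{H_2O}
\]
```

Two molecules of glutathione are consumed for each peroxide reduced, and the two are joined through a disulfide bond between their cysteine residues to give oxidized glutathione.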


Salt and Drought Stress 

Many plants live in environments where growth is severely impaired by 
either drought or high salinity. With increasing dependence on irrigation in 
agriculture and more frequent salting of icy and snowy roads in the winter, 
increased soil salinity has become a common problem worldwide. 
Approximately one-third of the world's irrigated land has become
unsuitable for growing crops because of contamination with high levels of salt.
Irrigation typically increases the amount of salt present in soil. To survive
under these conditions, many plants synthesize low-molecular-weight
nontoxic compounds collectively called osmoprotectants. These
compounds facilitate both water uptake and retention and also protect and
stabilize cellular macromolecules from damage by high salt levels. Some
well-known osmoprotectants are sugars, alcohols, the amino acid proline,
and quaternary ammonium compounds. To create more salt-tolerant
plants, scientists have tried to engineer an increase in the cellular
accumulation of the following osmoprotectants: trehalose, proline, D-ononitol,
mannitol, sorbitol, glycine betaine, and 3-dimethylsulfoniopropionate.

FIGURE 19.30 Conversion of choline to glycine betaine. CMO, choline
monooxygenase; BADH, betaine aldehyde dehydrogenase.

The quaternary ammonium compound betaine is a highly effective 
osmolyte that accumulates in some plants during periods of water stress or 
high salinity. However, several important crops, including potatoes, rice, 
and tomatoes, do not accumulate betaine. Thus, the introduction of
betaine-biosynthetic enzymes into these plants might enable them to withstand
water stress and/or high salinity. Betaine is synthesized from choline in 
two steps in both plants and bacteria (Fig. 19.30). In plants, such as spinach, 
choline is converted to betaine aldehyde by the enzyme choline
monooxygenase and then to betaine by betaine aldehyde dehydrogenase. In bacteria,
such as E. coli, both steps of betaine biosynthesis are catalyzed by the same
enzyme, choline dehydrogenase. To create a more salt-tolerant tobacco, 
plant cells were transformed with a Ti plasmid vector carrying the E. coli 
betA gene, which encodes choline dehydrogenase, under the control of the 
cauliflower mosaic virus 35S promoter. In laboratory tests, tobacco plants 
expressing this gene were up to 80% more tolerant of a high (300 mM) salt 
concentration than were nontransformed tobacco plants. While it may be 
possible to improve the osmoprotection afforded by the E. coli betA gene by 
using a plant tissue-specific promoter to direct the expression of the gene, 
this experiment is an important step in the development of plants that are 
more tolerant of high levels of salt. 
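
To give a feel for the 300 mM salt concentration used in this test, the short sketch below converts a molar concentration to grams per liter. It assumes that "salt" means sodium chloride, which the text does not state explicitly, and the function name is hypothetical.

```python
# Molar mass of NaCl: Na (22.99) + Cl (35.45) g/mol.
# Assumption: the "salt" in the experiment is sodium chloride.
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def nacl_grams_per_liter(millimolar):
    """Convert an NaCl concentration in mM to grams per liter."""
    if millimolar < 0:
        raise ValueError("concentration must be non-negative")
    return millimolar / 1000.0 * NACL_MOLAR_MASS_G_PER_MOL


if __name__ == "__main__":
    print(f"300 mM NaCl = {nacl_grams_per_liter(300):.1f} g/L")
```

On this assumption, 300 mM corresponds to roughly 17.5 g of NaCl per liter.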

It is also possible to increase the trehalose (Fig. 19.31) concentration in
plants (where trehalose is a natural alpha-linked disaccharide formed by an
α,α-1,1 bond between two α-glucose units) and thereby protect the plants
against inhibition by high levels of salt in the soil. To do this, rice plants
were transformed, using a binary vector, with one of two different DNA
constructs (Fig. 19.32). In E. coli, trehalose-6-phosphate is first formed from
uridine diphosphate (UDP)-glucose and glucose-6-phosphate, and then the
trehalose-6-phosphate is converted to trehalose. A fusion of the genes
encoding the two enzymes that normally catalyze the two steps in the
biosynthesis of trehalose in E. coli was constructed so that a single protein
contained both activities. This simplifies the transformation of plants in 
that only one target gene needs to be introduced and ensures that the two 
enzyme activities necessary for the synthesis of trehalose are present at 
identical levels. In one genetic construct, the fusion protein gene is under 
the transcriptional control of an abscisic acid-inducible promoter and is 
expressed in the cytosol. In the other construct, it is under the control of the 
promoter for the small subunit of ribulose bisphosphate carboxylase, and 
the fusion protein is expressed in plant chloroplasts. In transgenic rice 


FIGURE 19.31 Structure of trehalose.

FIGURE 19.32 Two genetic constructs used to engineer rice plants to synthesize
trehalose to protect them against growth inhibition by high salt levels. The fusion
protein gene is under the control of either (A) an abscisic acid (ABA)-inducible
promoter (ABA synthesis increases in the presence of salt) or (B) the promoter from
the small subunit of ribulose bisphosphate carboxylase. In panel B, the transit peptide
facilitates the localization of the fusion protein in the chloroplasts. RB, right border;
LB, left border; UTR, untranslated region.


plants that contain either DNA construct, the level of trehalose is 3 to 10 
times higher than in nontransformed rice plants in the presence of salt. 
Moreover, the biomass of the transgenic plants is four to six times that of 
the nontransformed plants in the presence of salt. Thus, by increasing the 
amount of trehalose that a plant synthesizes, the plant acquires increased 
tolerance for moderate levels of salt in the environment. 

Researchers have engineered the plant A. thaliana to be salt tolerant by 
sequestering sodium ions in the large intracellular vacuole (Fig. 19.33). The 
strategy consisted of overexpressing the endogenous A. thaliana gene
encoding an Na+/H+ antiport protein. The Na+/H+ antiport protein
transports Na+ into the vacuole using the electrochemical gradient of protons
generated by vacuolar H+-translocating enzymes. When tested, the
transgenic plants that overproduced the Na+/H+ antiport protein thrived in soil
that was watered with a solution of 200 mM salt. This approach to the
manipulation of salt stress in plants is effective with corn, canola, cotton, 
rice, tobacco, and tomato plants, as well as with A. thaliana. In transgenic 
tomato plants, the salt is localized in the leaves, and therefore, the transgenic 
tomato fruits do not accumulate salt and are quite normal in all respects, 
including taste. In addition to Na+ toxicity, plants that live in saline
environments have to contend with water loss caused by osmotic stress. When
the salt is concentrated in the plant's large vacuole, water that is free of salt should
be driven into the plant cells, resulting in plants that use water more
efficiently. This system has been quite successful in greenhouse trials and in the
limited number of field trials where it has been tested. It provides researchers
with the potential, especially in combination with other approaches, such as
the overproduction of certain osmolytes, to engineer a wide range of
salt-tolerant crop plants that can be grown on marginal land or possibly watered
with seawater or other salt-containing water. 

Many of the strategies that have been used to engineer plants to become 
more salt tolerant are also effective at making the plant drought tolerant; 
however, some strategies are specific for one stress or the other. In fact, a 
very large number of different genes have been employed in attempts to 
create drought-tolerant transgenic plants. These approaches have included
introducing genes that direct the overproduction of various osmolytes (e.g.,
trehalose, proline, glycine betaine, and polyamines), plant stress proteins (e.g.,
chaperones and heat shock proteins), reactive-oxygen-scavenging proteins
(e.g., superoxide dismutases), hormone biosynthesis and catabolism
proteins (e.g., affecting cellular levels of abscisic acid, cytokinin, and ethylene),
transcription factors that turn on the synthesis of a host of other proteins,
and signaling proteins that activate the synthesis of other proteins.

FIGURE 19.33 Schematic representation of ion transport in the plant A. thaliana
showing the Na+ ions being sequestered in the large vacuole.

One group of researchers reasoned that in order to increase the
tolerance of plants for drought, it is necessary to delay the onset of
drought-induced senescence during the drought episode. Moreover, by suppressing
this response, plants would be more likely to resume normal growth when 
water became available. Of course, it is necessary to keep in mind that 
regardless of their genetic makeup, plants cannot exist for indefinite periods 
in the absence of water. Prior to this work, it had been observed that leaf 
senescence could be delayed in transgenic plants expressing a foreign gene 
encoding isopentenyltransferase, an enzyme that catalyzes the rate-limiting 
step in cytokinin biosynthesis. Therefore, tobacco plants were transformed 
to express isopentenyltransferase under the control of a SARK
(senescence-associated protein kinase gene) promoter (Fig. 19.34). This regulatable
promoter is induced during late maturation, and its activity decreases during the
development of senescence. When these transgenic tobacco plants were
watered with only 30% of the amount of water used under normal
conditions, the suppression of leaf senescence resulted in a four- to
fivefold-higher level of biomass in the transgenic versus the nontransformed plants.
This result suggests that it may be possible to get irrigated crops to grow 
normally with only one-third the amount of water that is usually used. 

Fruit Ripening and Flower Wilting 

A major problem in fruit marketing is premature ripening and softening 
during transport. These changes are part of the natural aging (senescence)

FIGURE 19.34 Overview of how overproducing cytokinin protects plants against
drought. PSARK is a senescence-induced promoter. The IPT gene encodes
isopentenyltransferase, which catalyzes the rate-limiting step in cytokinin biosynthesis.


process of the fruit. Some of the genes that are induced during ripening 
encode the enzymes cellulase and polygalacturonase. It has been
postulated that by interfering with the expression of one or more of these genes,
the ripening process might be delayed. This interference could be achieved
by creating transgenic plants with antisense or sense (cosuppression)
RNA-producing versions of these genes. In fact, when an antisense
RNA-producing gene for polygalacturonase was introduced into tomato plants,
a $1.3 billion-a-year crop in the United States, both polygalacturonase
mRNA and enzymatic activity were reduced by 90%. The lowering of
polygalacturonase production slowed fruit softening in tomatoes, permitting
the tomatoes to ripen on the vine instead of being harvested while they 
were still green. These tomatoes were claimed to have a long shelf life 
while retaining the flavor of the tomato. This genetically engineered tomato 
is known as the Flavr Savr (pronounced "flavor saver") tomato. On 18 May 
1994, the U.S. Food and Drug Administration ruled that the Flavr Savr 
tomato was as safe for human consumption as tomatoes that were bred by 
conventional means, and because Flavr Savr tomatoes were essentially the 
same as other tomatoes, special labeling was not required. 
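
The antisense approach described above depends on the antisense RNA being the reverse complement of the target mRNA, so that the two strands can base pair. The following minimal sketch shows that relationship; the 12-nucleotide sequence is a made-up example, not part of the real polygalacturonase mRNA, and the function names are hypothetical.

```python
# Watson-Crick pairing rules for RNA.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def antisense_rna(mrna):
    """Return the antisense strand (reverse complement), written 5' to 3'."""
    mrna = mrna.upper()
    if set(mrna) - set("AUGC"):
        raise ValueError("expected an RNA sequence containing only A, U, G, C")
    return mrna.translate(COMPLEMENT)[::-1]


def is_fully_paired(mrna, antisense):
    """True if the two strands can form a perfect antiparallel duplex."""
    return antisense_rna(mrna) == antisense.upper()


if __name__ == "__main__":
    example_mrna = "AUGGCUACCGAU"  # hypothetical fragment, for illustration only
    example_antisense = antisense_rna(example_mrna)
    print(example_mrna, example_antisense,
          is_fully_paired(example_mrna, example_antisense))
```

In a transgenic plant, the antisense strand is produced by inserting the gene (or a fragment of it) in the reverse orientation relative to the promoter; the sketch only illustrates the sequence relationship, not the downstream mechanism by which expression is reduced.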

The plant growth regulator ethylene induces the expression of a number 
of genes that are involved in fruit ripening and senescence and in flower 
wilting. It is synthesized from methionine by way of the intermediate
compounds S-adenosylmethionine and 1-aminocyclopropane-1-carboxylic acid
(ACC) (Fig. 19.35). Treatment of plants with chemical compounds that block 
ethylene production delays fruit ripening, senescence, and flower wilting. 
Thus, premature fruit ripening and flower wilting might be prevented by 
inhibiting the synthesis of ethylene. This can be achieved by blocking
several different steps in the ethylene biosynthesis pathway (Fig. 19.35). For

FIGURE 19.35 Centrality of ACC in the synthesis of ethylene. M-ACC,
1-(malonylamino)cyclopropane-1-carboxylic acid; G-ACC, 1-(γ-L-glutamylamino)
cyclopropane-1-carboxylic acid.


example, transgenic plants that have been engineered to contain antisense 
RNA versions of S-adenosylmethionine synthetase, ACC synthase, or ACC 
oxidase have much lower than normal levels of ethylene. Similarly, the 
amount of ACC, and hence the amount of ethylene, that can be synthesized 
may be decreased by increasing the activity of either malonyl ACC
transferase or glutamyl ACC transferase, thereby converting ACC into one of
these dead-end storage compounds (Fig. 19.35), or by transforming plants 


FIGURE 19.36 Inhibition of ethylene biosynthesis by genetic manipulation. Normally,
ACC is synthesized from S-adenosylmethionine by the enzyme ACC synthase, and
then ACC oxidase converts ACC to ethylene. Ethylene synthesis may be inhibited
in transgenic plants either by an antisense mRNA version of ACC synthase or ACC
oxidase, which inhibits the synthesis of these enzymes, or by the enzyme ACC
deaminase, which competes with ACC oxidase for the available ACC, producing
ammonia and α-ketobutyrate rather than ethylene.

with the gene for the bacterial enzyme ACC deaminase, which converts ACC to
α-ketobutyrate (Fig. 19.36). Each of these genetic manipulations results in a
decreased level of ethylene and thereby extends the storage life of fruit and 
flowers. 
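
The competition between ACC oxidase and ACC deaminase for the same pool of ACC can be illustrated with a deliberately simplified calculation. The sketch below assumes that both enzymes consume ACC by first-order kinetics, so that the share going to ethylene is just the ratio of rate constants. This model, the function name, and the rate constants are assumptions made for illustration; saturation kinetics and the other ACC sinks (M-ACC and G-ACC) are ignored.

```python
def fraction_to_ethylene(k_oxidase, k_deaminase):
    """Fraction of ACC converted to ethylene when two routes compete.

    Assumes two parallel first-order reactions drawing on the same ACC pool:
    ACC oxidase (rate constant k_oxidase) and ACC deaminase (k_deaminase).
    """
    if k_oxidase < 0 or k_deaminase < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_oxidase + k_deaminase
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_oxidase / total


if __name__ == "__main__":
    # Hypothetical rate constants in arbitrary units.
    print(fraction_to_ethylene(1.0, 0.0))  # no deaminase: all ACC goes to ethylene
    print(fraction_to_ethylene(1.0, 3.0))  # deaminase three times as active as oxidase
```

In this toy model, a deaminase activity three times that of the oxidase would divert three-quarters of the ACC away from ethylene. Real plants are more complicated, but the calculation shows why adding a competing enzyme can lower ethylene levels without touching the plant's own biosynthetic genes.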

By screening a large number of soil bacteria for the ability to utilize 
ACC as a sole source of nitrogen, strains that degrade ACC were identified. 
From one of these strains, the gene for the enzyme ACC deaminase was 
isolated based on the ability of transformed E. coli strains that expressed 
this gene to grow on minimal medium containing ACC. This gene was 
subcloned, put under the control of the 35S promoter from cauliflower 
mosaic virus, and expressed in tomato plants. The transgenic plants
synthesized a much lower level of ethylene than did normal plants, and the
fruit of the transgenic plants had a significantly longer shelf life. These 
bioengineered changes result in fewer losses due to spoilage because of the 
much lower levels of ethylene. Similar results have been observed with 
transgenic cantaloupes that have lowered ethylene levels. This strategy to 
delay fruit ripening is effective with a range of different fruits. 

In another experiment, researchers isolated an 850-base-pair (bp) DNA 
fragment that corresponded to a portion of the cDNA for ACC oxidase 
from the tropical plant torenia (Torenia fournieri Lind.)—the complete cDNA 
is about 1 kilobase pair. The cDNA fragment was cloned in both the sense 
and antisense orientations into a binary vector and then used to transform 
torenia. In wild-type plants, the flowers lasted an average of 2.0 days before 
they wilted; transgenic plants with the ACC oxidase cDNA fragment in the 
antisense orientation lasted 2.7 days—a small but significant difference— 
and transgenic plants with the ACC oxidase cDNA fragment in the sense 
orientation lasted around 4.4 days. With both types of transgenic plants, 
not only did the flowers last longer, but also more flowers bloomed per 
stem than with the wild-type plant, yielding a more aesthetically pleasing 
plant. 

In the future, a large number of plants will be engineered to have lower 
ethylene levels, primarily so that fruit ripening and flower wilting, or 
abscission, are inhibited. Fruits that are likely to be the targets of such 
genetic manipulation include melons, pineapples, and bananas; targeted 
flowers might include roses, carnations, tulips, chrysanthemums, and 
orchids. 


SUMMARY 


By using a variety of techniques (see chapter 18), it has
become relatively straightforward to transform plants
with foreign genes. Plants have been engineered to be
resistant to a range of environmental stresses, including insects,
viruses, herbicides, pathogens, and oxidative and salt stress. 

Several different strategies have been used to confer
resistance against insect predators, including introducing a gene
encoding an insecticidal protoxin produced by one of several
subspecies of B. thuringiensis; plant proteins, such as α-amylase
inhibitors, lectins, or protease inhibitors; or other bacterial
insecticidal proteins. 

Transgenic plants expressing the gene for a viral coat
protein are protected against infection by that virus. They may
also be protected against damage from infective viruses by
expression of other genes, such as those encoding E. coli RNase III,
pokeweed antiviral proteins, and single-chain antibodies
directed against various viruses.

To permit crop plants to proliferate in the presence of 
weeds, many plants have been engineered to be resistant to 
one or more "environmentally friendly" herbicides. This 
approach has become enormously successful and is the basis 
for the largest number of transgenic plants that are used in the 
field. 

To develop plants resistant to fungal and bacterial
pathogens, several approaches have been tested. For example,
transgenic plants have been engineered to express high levels 
of chitinase or lysozyme or to overproduce PR proteins. 

Different foreign proteins protect plants against different
stresses. Superoxide dismutase and oxidized glutathione
protect plants against oxidative stress, betaine overproduction
and compartmentalization of sodium ions into the vacuole are effective
against salt stress, and lowering plant ethylene levels has an
impact on many different types of stress.

In sum, numerous transgenic plants with altered properties
and contents have been successfully produced and tested in
the laboratory and in some cases in the field. More and more
genetically engineered plants have entered the marketplace,
and it is likely that transgenic plants will become an integral
part of agricultural and horticultural practice.

REVIEW QUESTIONS

1. A local crop is being ravaged by a nonenveloped virus with 
a single-stranded RNA genome (8,000 nucleotides long). The 
virus and its RNA can be readily isolated. In addition, you 
have antibodies against all four of the proteins encoded within 
the viral genome. Describe a strategy that you could use to 
protect the crop against this viral infection and prevent
subsequent damage.

2. Suggest several different strategies for developing
insect-resistant plants.

3. How can protease inhibitors, α-amylase inhibitor,
cholesterol oxidase, Vip proteins, and ricin each protect a plant
against damage from insect predation? 

4. How can RNAi be used to protect plants against damage 
from insect predation? 

5. Suggest a couple of strategies for simultaneously protecting 
a plant against damage from several different viruses. 

6. How can RNAi be used to protect plants against damage 
from plant viruses? 

7. What general strategies can be employed in genetically 
engineering plants to be resistant to herbicides? 

8. Suggest two different strategies for engineering plants that 
are resistant to the herbicide glyphosate. Why is this
important?


9. How can crop plants be engineered to be resistant to the 
herbicide dicamba? 

10. How can plants be engineered to resist damage from 
pathogenic soil fungi? 

11. How can plants be genetically engineered to be resistant 
to pathogenic bacteria? 

12. How can single-chain Fv antibodies be engineered to
protect plants against fungal pathogens?

13. How can a plant's systemic acquired resistance response
be engineered to confer resistance to a broad spectrum of both
fungal and bacterial pathogens?

14. What is the effect of increasing the level of oxidized
glutathione within a plant? How would you genetically manipulate
a plant to do this? 

15. Suggest several strategies that could be used to engineer 
plants that are resistant to growth inhibition by salt and by 
drought. 

16. You have been asked by an avocado grower to find a way 
to genetically engineer his crop to prevent it from ripening 
during shipping. What experimental approaches would you 
consider?

Modification of Plant Nutritional Content

Over the years, agronomists and plant breeders have been extremely
successful in optimizing the useful properties (e.g., protein or oil
content) and increasing the productivity (yield) of a large number
of crop plants. However, traditional breeding approaches to crop
improvement are both difficult and slow, and they are intrinsically limited by the
existing genetic content of cross-breeding strains. In contrast, the use of
genetic engineering techniques allows scientists both to dramatically
speed up the process of developing plants with improved characteristics
and to introduce traits that would otherwise be impossible to develop by
traditional techniques. For example, on a laboratory scale, genetic
engineering has been used to improve (1) the nutritional quality of several
different plants, including corn (maize) and pea, by modification of the
amino acid content of some of their seed storage proteins; (2) the fatty acid
compositions of both edible and nonedible oil-producing crops; and (3)
the taste of fruits and vegetables by the introduction of monellin, a
sweet-tasting protein.


Amino Acids 

Seed storage proteins, which are used as sources of both carbon and 
nitrogen during seed germination, contain a limited number of amino 
acids, which are organized into repeating peptide units. Often, the
nutritional value of these proteins is deficient because they lack one or more of
the amino acids, usually lysine or methionine, that are essential for human 
health. The amino acid composition of the seed storage proteins can be 
altered to a limited extent by breeding programs, but genetic engineering 
strategies can also be used. 

The bulk of the lupine, a grain legume, that is produced annually in
Australia (>800,000 tons) is used to feed cattle, pigs, and chickens.
Unfortunately, like most other grain legumes, lupine is deficient in
methionine and cysteine. Therefore, lupine feed is supplemented with
methionine. To provide animals with more nutritious feed without methionine
FIGURE 20.1 Schematic representation of the T-DNA construct used to transform
lupines to increase the methionine content. The arrows indicate the direction of
transcription. LB, left border; P35S, the 35S promoter from cauliflower mosaic virus;
uidA, a bacterial gene encoding β-glucuronidase; TT, transcription termination
region, including a polyadenylation site; ssa, the sunflower seed albumin gene; Pvic,
the promoter from the pea vicilin gene; bar, a bacterial gene that confers resistance
to the herbicide phosphinothricin, which is used as a selectable marker; RB, right
border.


supplementation, lupines were engineered to express sunflower seed 
albumin, which is both stable in the rumen and unusually rich in the 
sulfur-containing amino acids methionine and cysteine (Fig. 20.1). Sunflower 
seed albumin escapes microbial breakdown in the rumen and is therefore 
available for digestion and absorption in the lower gastrointestinal tract. 
The transgenic lupine plants that expressed sunflower seed albumin were 
used as an animal feed, and, as expected, rats that received transgenic 
lupines as their sole nitrogen source made significantly greater weight 
gains than did rats fed nontransgenic lupines, comparable to what would 
be expected if a nontransgenic lupine diet had been supplemented with 
pure methionine. 


FIGURE 20.2 Schematic representation of the biosynthetic pathway for amino acids 
derived from aspartic acid. Not all of the steps and intermediates are shown. 
Feedback inhibition is shown by the dashed lines. DHDPS, dihydrodipicolinic acid 
synthase; AK, aspartokinase. 
FIGURE 20.3 Schematic representation of a Ti plasmid vector used to transform
soybean and canola with genes that modify their lysine contents. Pv5′, the promoter
from the bean β-phaseolin gene; Tv3′, the transcription termination region from the
bean β-phaseolin gene; cts, the coding region for the chloroplast transit peptide
from the small subunit of ribulose bisphosphate carboxylase; dapA, a gene from
Corynebacterium that encodes a lysine-insensitive DHDPS; lysCM4, a mutant
version of the E. coli lysC gene that encodes a lysine-insensitive AK. The left and right
borders of the T-DNA are indicated by LB and RB, respectively.


One novel way to increase the lysine content of seeds is to increase the
production of lysine in transgenic plants by deregulating the lysine
biosynthetic pathway. The amino acids lysine, threonine, methionine, and
isoleucine are all derived from aspartic acid (Fig. 20.2). The first step in the
conversion of aspartic acid to lysine is phosphorylation of the aspartic acid
by aspartokinase (AK) to produce β-aspartyl phosphate. The condensation
of aspartic β-semialdehyde with pyruvic acid to form
2,3-dihydrodipicolinic acid, which is catalyzed by dihydrodipicolinic acid
synthase (DHDPS), is the first reaction in the pathway that is committed to
lysine biosynthesis. Both AK and DHDPS are feedback inhibited by lysine.
Thus, to overproduce lysine, it is necessary to abolish the feedback
inhibition of these two enzymes. This was accomplished by cloning naturally
lysine feedback-insensitive genes for DHDPS and AK from Corynebacterium
and Escherichia coli, respectively; fusing each of these genes to a
chloroplast transit peptide (to ensure that the two proteins are localized in
seed plastids); placing each gene under the control of a seed-specific
promoter; and then introducing the two genes on a Ti plasmid binary vector
into canola and soybean plants (Fig. 20.3). Transgenic canola and soybean
plants had more than a 100-fold increase in the free lysine in their seeds,
with an overall doubling of the total seed lysine content in canola and a
fivefold increase in the total lysine content in soybean.
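
The gap between a 100-fold rise in free lysine and a two- to fivefold rise in total lysine can be made concrete with a little arithmetic. The sketch below rests on two assumptions that are not stated in the text: that protein-bound lysine is unchanged in the transgenic seeds, and that free lysine rose exactly 100-fold (the text says "more than"). Under those assumptions it back-calculates the share of seed lysine that was free before transformation; the function name is illustrative only.

```python
def initial_free_fraction(free_fold, total_fold):
    """Fraction of total seed lysine that was free before transformation.

    Assumes bound lysine is unchanged, so that
        total_fold = (1 - f) + free_fold * f
    where f is the initial free fraction. Solving for f gives the result.
    """
    return (total_fold - 1.0) / (free_fold - 1.0)


# Fold changes taken from the text; 100 is treated as exact (an assumption).
canola = initial_free_fraction(free_fold=100, total_fold=2)
soybean = initial_free_fraction(free_fold=100, total_fold=5)

print(f"canola:  about {canola:.1%} of seed lysine was initially free")
print(f"soybean: about {soybean:.1%} of seed lysine was initially free")
```

On these assumptions, only a few percent of seed lysine would have been free to begin with, which is why a very large increase in the free pool translates into a much more modest change in the total.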

Currently, when corn is used as an animal feed, it must be supplemented
with soybean meal, purified lysine, or both. In the future, it may be
possible to replace the use of expensive lysine with inexpensive transgenic
soybean meal from soybean plants that overproduce lysine. Moreover, it
may eventually be possible, by using the approach that has been successful
with soybean, to engineer corn to overproduce lysine. High-lysine corn
would be more nutritious for both animals and humans.

Lipids 

It has been estimated that annual global plant oil production will be worth 
around $70 billion by 2010. More than 90% of this production is for human 
consumption in margarines, shortenings, salad oils, and frying oils. 
Together, soybean, palm, canola (rapeseed), and sunflower account for 
approximately 80% of worldwide plant oil production. For the most part, 
these oils consist of palmitic, stearic, oleic, linoleic, and linolenic acids 
(Table 20.1). In addition, some vegetable oils contain fatty acids with
conjugated double bonds. This is in contrast to the more usual case in which
TABLE 20.1 Some important plant fatty acids

Common name          Abbreviation
Caprylic acid        C8:0
Capric acid          C10:0
Lauric acid          C12:0
Myristic acid        C14:0
Palmitic acid        C16:0
Stearic acid         C18:0
Petroselinic acid    Δ6 C18:1
Oleic acid           Δ9 C18:1
Linoleic acid        Δ9,12 C18:2
Linolenic acid       Δ9,12,15 C18:3
Ricinoleic acid      12-OH Δ9 C18:1
Erucic acid          Δ13 C22:1

The first number after the C denotes the number of carbon atoms;
the number after the colon is the degree of unsaturation, i.e., the number
of C=C bonds; a Δ followed by a number indicates the position of the
first carbon atom that is involved in the C=C bond; a number followed
by an OH indicates the position on the chain of a hydroxyl group. All
C=C bonds are cis. The numbering system for fatty acids begins with the
carboxyl group as 1.


the typical polyunsaturated fatty acid of plant seed oils contains double
bonds that are separated by methylene (-CH2-) groups. The presence of
conjugated double bonds increases the rate of oxidation compared with
polyunsaturated fatty acids with methylene-interrupted double bonds,
making them well suited for use as drying agents in paints and inks
because they require less oxygen for the polymerization reactions that
occur during the drying process. In an effort to eat healthier foods,
consumers have become concerned about the nutritional content of various
edible oils. As can be seen in Table 20.2, the fatty acid content of edible oils
can vary dramatically. Moreover, it is deemed desirable to have as low a
level as possible of saturated fats, a high level of oleic acid (which lowers
the undesirable low-density lipoproteins, or LDLs, without affecting the
desirable high-density lipoproteins, or HDLs), and as high a level as
possible of omega-3 fatty acids.
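
The shorthand used in Table 20.1, and the distinction between conjugated and methylene-interrupted double bonds, can be illustrated with a short parser. This is a minimal sketch: the plain-text spelling of the shorthand (for example "Δ9,12 C18:2" or "12-OH Δ9 C18:1"), the class, and the function names are assumptions made for illustration and do not come from the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FattyAcid:
    carbons: int        # number of carbon atoms in the chain
    double_bonds: int   # degree of unsaturation (number of C=C bonds)
    positions: tuple    # first carbon of each C=C bond, counted from the carboxyl group
    hydroxyls: tuple    # chain positions of any hydroxyl groups


# Assumed plain-text spelling of the shorthand, e.g. "12-OH Δ9 C18:1".
_SHORTHAND = re.compile(
    r"^(?:(?P<oh>\d+)-?OH\s*)?"           # optional hydroxyl position
    r"(?:Δ(?P<pos>\d+(?:,\d+)*)\s*)?"     # optional double-bond positions
    r"C(?P<carbons>\d+):(?P<bonds>\d+)$"  # chain length and number of C=C bonds
)


def parse_fatty_acid(shorthand):
    """Parse one shorthand string into a FattyAcid record."""
    match = _SHORTHAND.match(shorthand.strip())
    if match is None:
        raise ValueError(f"unrecognized shorthand: {shorthand!r}")
    positions = tuple(int(p) for p in (match.group("pos") or "").split(",") if p)
    hydroxyls = (int(match.group("oh")),) if match.group("oh") else ()
    acid = FattyAcid(int(match.group("carbons")), int(match.group("bonds")),
                     positions, hydroxyls)
    if positions and len(positions) != acid.double_bonds:
        raise ValueError("number of Δ positions must equal the number of C=C bonds")
    return acid


def has_conjugated_bonds(acid):
    """True if two C=C bonds start two carbons apart (C=C-C=C).

    Methylene-interrupted bonds start three carbons apart (C=C-CH2-C=C).
    """
    ordered = sorted(acid.positions)
    return any(b - a == 2 for a, b in zip(ordered, ordered[1:]))


linoleic = parse_fatty_acid("Δ9,12 C18:2")
print(linoleic, has_conjugated_bonds(linoleic))
```

Linoleic acid, with double bonds starting at carbons 9 and 12, is reported as methylene interrupted; a hypothetical input such as "Δ9,11 C18:2" would be reported as conjugated.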


TABLE 20.2 Dietary fats present in various oils

Oil          Saturated   Polyunsaturated   Monounsaturated fat   Omega-3 fatty
             fat (%)     fat (%)           (oleic acid) (%)      acids (%)
Canola       7           21                61                    11
Safflower    10          76                14                    Trace
Sunflower    12          71                16                    1
Corn         13          57                29                    1
Olive        15          9                 75                    1
Soybean      15          54                23                    8
Peanut       19          33                48                    Trace
Cottonseed   27          54                19                    Trace
Palm         51          10                39                    Trace
Butterfat    66          3                 28                    1
Coconut      91          2                 7                     None
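
The criteria named in the text (low saturated fat, high oleic acid, high omega-3 content) can be applied directly to the values in Table 20.2. The following sketch simply encodes the table and picks out the best oil on each criterion; recording "Trace" and "None" as 0 is a simplifying assumption, and the variable names are illustrative.

```python
# Columns: saturated, polyunsaturated, monounsaturated (oleic), omega-3, all in %.
# "Trace" and "None" in Table 20.2 are recorded as 0 here (a simplification).
OILS = {
    "Canola": (7, 21, 61, 11),
    "Safflower": (10, 76, 14, 0),
    "Sunflower": (12, 71, 16, 1),
    "Corn": (13, 57, 29, 1),
    "Olive": (15, 9, 75, 1),
    "Soybean": (15, 54, 23, 8),
    "Peanut": (19, 33, 48, 0),
    "Cottonseed": (27, 54, 19, 0),
    "Palm": (51, 10, 39, 0),
    "Butterfat": (66, 3, 28, 1),
    "Coconut": (91, 2, 7, 0),
}

SATURATED, POLY, OLEIC, OMEGA3 = range(4)

lowest_saturated = min(OILS, key=lambda name: OILS[name][SATURATED])
highest_oleic = max(OILS, key=lambda name: OILS[name][OLEIC])
highest_omega3 = max(OILS, key=lambda name: OILS[name][OMEGA3])

print("lowest saturated fat:", lowest_saturated)
print("highest oleic acid:  ", highest_oleic)
print("highest omega-3:     ", highest_omega3)
```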
Canola oil is considered to be one of the healthiest and most nutritious
edible oils. In 1989, canola oil received an award from the American Health
Foundation as the Health Product of the Year. Canola is derived from
rapeseed (which may contain up to 40% erucic acid); however, following more
than 40 years of conventional breeding, by definition, canola (for Canada oil
low erucic acid) contains less than 1% erucic acid (several reports have
associated erucic acid with various cancers and heart ailments) and a
maximum of 18 μmol of all glucosinolates (secondary metabolites
responsible for the bitter or sharp taste of many common foods, such as mustard
and horseradish) per gram in whole seed. Notwithstanding the enormous
success that has been achieved in developing canola by conventional
breeding, it is currently possible, by genetic engineering, to rapidly modify
a number of genetic traits to continue to improve canola. For example,
approximately 90% of the Canadian canola crop has been engineered to be
herbicide resistant. In addition, some varieties of canola have been
engineered to be insect resistant, and it is currently feasible to change the
degree of unsaturation, i.e., the number of carbon-carbon double bonds,
and to modify the chain lengths of fatty acids in canola plants by genetic
manipulation. A number of transgenic varieties of canola, each producing
a different modified oil, have been created (Table 20.3). Each transgenic
variety contains one additional gene. For example, the production of
shortening, margarine, and confectionery goods requires large amounts of
stearate. One variety of transgenic canola contains an antisense copy of a
Brassica stearate desaturase gene, which inhibits the expression of the
normal canola gene and leads to the accumulation of stearic acid rather
than the desaturation of stearic acid to oleic acid. Progress on the
production of transgenic canola varieties with modified seed oil properties has
been both rapid and impressive.
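
An antisense construct carries the cDNA in the reverse orientation, so that its transcript is complementary to the normal mRNA. At the sequence level this corresponds to taking the reverse complement of the coding strand, which the sketch below illustrates. The 15-nucleotide sequence is a made-up placeholder, not part of any real Brassica stearate desaturase gene, and the function name is illustrative.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical coding-strand fragment beginning with an ATG start codon.
sense = "ATGGCTCTCAAGCTC"
antisense = reverse_complement(sense)

print("sense:    ", sense)
print("antisense:", antisense)
```

Applying the function twice returns the original sequence, which is a quick check that the orientation has been handled correctly.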

The omega-3 and omega-6 fatty acids are precursors for many
prostaglandins and are therefore directly responsible for regulating a number of
important human metabolic functions. Until now, the major source for
these important fatty acids has been marine and fish oils. However, for a
variety of reasons, including the fact that global fish resources have
declined dramatically in recent years, researchers have turned their
attention to genetically engineering plants to produce safe, affordable, and
renewable alternatives to the traditional sources. Given the fact that plants
can synthesize linoleic and α-linolenic acid, C-18 precursors of the
long-chain omega-3 and omega-6 fatty acids, it may be possible to engineer

TABLE 20.3 Transgenic canola varieties with modified seed lipid contents

Seed product        Commercial use(s)
40% Stearic acid    Margarine, cocoa butter
40% Lauric acid     Detergents
60% Lauric acid     Detergents
80% Oleic acid      Food, lubricants, inks
Petroselinic        Polymers, detergents
"Jojoba" wax        Cosmetics, lubricants
40% Myristate       Detergents, soaps, personal care items
90% Erucic acid     Polymers, cosmetics, inks, pharmaceuticals
Ricinoleic acid     Lubricants, plasticizers, cosmetics, pharmaceuticals

Adapted from Murphy, Trends Biotechnol. 14:206-213, 1996.
plants to transform linoleic and α-linolenic acids into long-chain omega-3
and omega-6 fatty acids. In fact, when Arabidopsis thaliana plants were
transformed (in three separate stages) with three additional genes, the
linoleic and α-linolenic acids were converted to arachidonic and
eicosapentaenoic acids, respectively (Fig. 20.4). The three introduced genes
included a Δ9-specific elongase from the marine microalga Isochrysis
galbana, a Δ8-desaturase from the protist Euglena gracilis, and a Δ5-desaturase
from the fungus Mortierella alpina. Despite the fact that the engineered
plants contained higher than normal levels of arachidonic acid and
eicosapentaenoic acid, i.e., 7% and 3%, respectively, in their leaf tissue, plant
growth and development were normal. Thus, it is possible to engineer
pathways in plants for the production of long-chain polyunsaturated fatty
acids that are vital for human health. It is now necessary to engineer this
pathway so that the three introduced genes are expressed in a seed-specific
manner in an oilseed crop, such as soybean or canola.


Vitamins 

Vitamin E. A substantial body of evidence indicates that dietary
supplementation with the lipid-soluble antioxidant vitamin E (400 international
units, or approximately 250 mg, of [R,R,R]-α-tocopherol daily) results in a
decreased risk for cardiovascular disease and cancer, assists in immune
function, and prevents or slows a number of degenerative diseases in
humans. The oils that are extracted from seeds have a relatively high level


FIGURE 20.4 Overview of the conversion of linoleic acid to arachidonic acid and
linolenic acid to eicosapentaenoic acid in transgenic A. thaliana expressing three
foreign genes (all under the control of the cauliflower mosaic virus 35S promoter).
IgASE1, Δ9-specific fatty acid elongase from I. galbana; EuΔ8, Δ8-desaturase from E.
gracilis; MortΔ5, Δ5-desaturase from M. alpina. Panels: linoleic acid (C18:2) to
arachidonic acid (C20:4); linolenic acid (C18:3) to eicosapentaenoic acid (C20:5).
FIGURE 20.5 Conversion of γ-tocopherol to α-tocopherol by the enzyme γ-tocopherol
methyltransferase.


of total tocopherols, but in most cases, only a very small fraction is
α-tocopherol. Moreover, even in those instances where the fraction of
α-tocopherol is large, e.g., sunflower oil, to obtain a sufficient amount of
vitamin E to confer the above-mentioned health benefits, it would be
necessary for an individual to consume nearly 400 g of oil daily. An
alternative approach to this problem is to engineer plants to produce a greater
percentage of α-tocopherol than γ-tocopherol by transforming plants with
a gene encoding the enzyme γ-tocopherol methyltransferase, which
catalyzes the addition of a methyl group to γ-tocopherol (Fig. 20.5). However,
since a suitable γ-tocopherol methyltransferase gene was not available, a
strategy for its isolation had to be devised (Fig. 20.6). One of the genes in
the α-tocopherol biosynthetic pathway, the gene encoding the enzyme
p-hydroxyphenylpyruvate dioxygenase (HPPDase), had previously been
cloned from the plant A. thaliana. However, at the time, it was not possible
to isolate the other genes in the pathway. A computer comparison of the
sequence of the HPPDase gene with the complete DNA sequence of the
cyanobacterium Synechocystis sp. strain PCC6803 revealed the presence of
an open reading frame that encoded a protein of the expected size that was
35% identical to the amino acid sequence of the Arabidopsis-encoded
protein. The Synechocystis putative HPPDase gene was found within a 10-gene
operon thought to encode all of the enzymes involved in the synthesis of
α-tocopherol. One of the other genes within this operon encoded a protein
whose predicted amino acid sequence was similar to those of several
known plant Δ-(24)-sterol-C-methyltransferases. This enzyme has
S-adenosylmethionine-binding domains, and S-adenosylmethionine
donates the methyl group to γ-tocopherol during the conversion to
α-tocopherol. In addition, the gene had an N-terminal bacterial signal
sequence. When the cyanobacterial gene was cloned and expressed in E.
coli, the recombinant protein catalyzed the methylation of γ-tocopherol to
α-tocopherol.

FIGURE 20.6 Schematic overview of the isolation of the A. thaliana cDNA encoding
the enzyme γ-tocopherol methyltransferase (γ-TMT) and its use to engineer A.
thaliana to overproduce the enzyme. HPPDase, p-hydroxyphenylpyruvate
dioxygenase.

The DNA sequence of the Synechocystis γ-tocopherol methyltransferase
gene was then compared with DNA sequences that were normally
expressed in Arabidopsis, and one Arabidopsis gene that showed 66%
homology to the bacterial protein at the amino acid level was identified,
cloned, and expressed in E. coli. This recombinant protein converted
γ-tocopherol to α-tocopherol. The Arabidopsis γ-tocopherol
methyltransferase gene under the transcriptional control of a seed-specific promoter
from carrots was then used to transform Arabidopsis plants. The α-tocopherol
levels of these transgenic plants were significantly higher than they were
in the nontransformed plants. Based on this model system, with the
introduction of additional genetic manipulations (which increase the flux
through the pathway), these results have been extended to corn embryos
and soybean seeds, significantly increasing the nutritional value of the oils
produced by these plants. Despite the progress that has been made,
transgenic crops that produce higher levels of vitamin E are not yet sufficiently
optimized for them to be commercialized.
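
The gene hunt described above relied on comparing amino acid sequences and reporting figures such as "35% identical". A minimal sketch of how percent identity can be computed for two sequences that have already been aligned is shown below. The two short sequences are invented placeholders rather than real HPPDase or methyltransferase sequences, and ignoring gap columns in the denominator is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns containing a gap in either sequence are skipped, so the
    denominator is the number of residue-to-residue columns (an assumed
    convention; other definitions divide by the full alignment length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no residue-to-residue columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (one-letter amino acid codes, "-" for gaps).
query = "MKT-LLVAGG"
subject = "MRTALLIA-G"
print(f"{percent_identity(query, subject):.1f}% identity")
```

In practice the alignment itself would come from a database search tool; the function above only scores an alignment that is already in hand.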

Vitamin A. Although rice (Oryza sativa) is the staple food of approximately
half of the world's population, it is a poor source of several nutrients and
vitamins, including vitamin A. About 124 million children worldwide are
deficient in vitamin A; this deficiency leads to 1 million to 2 million deaths
per year and is a leading cause of vision impairment, including night
blindness and total blindness. One way to address the myriad of health
problems that result from vitamin A deficiency would be to engineer rice to
produce the vitamin A precursor provitamin A (β-carotene). Mammals
synthesize vitamin A from β-carotene, which is a common carotenoid pigment
normally found in plant photosynthetic membranes. In the year 2000, an
international group of scientists reported using Agrobacterium-mediated
transformation to introduce the entire β-carotene biosynthetic pathway
into rice (Fig. 20.7). The phytoene synthase and phytoene desaturase genes
were introduced on a construct that did not contain any selectable marker.
The lycopene β-cyclase gene was part of a separate construct that contained
a selectable marker. The frequency of insertion of all three genes into the
rice genome and their subsequent expression were quite high. Thus, the
engineered rice produces β-carotene, which, after ingestion, is converted to
vitamin A. At the time, it was thought that this strategy would facilitate the
eventual development of transgenic strains of rice that not only produced
high levels of β-carotene but also no longer contained any antibiotic
resistance marker genes (removed as described in chapter 18). The transgenic
rice that produces β-carotene has a yellow or golden color and has been
called "golden rice" by the scientists involved in its development.
Unfortunately, the initial version of golden rice, now called golden rice 1,
synthesized only 1.6 μg of β-carotene per gram of rice, so that individuals
would have had to consume around 3 kg of golden rice 1 each day to reach
the recommended minimal daily requirement of vitamin A. However, in
2005, scientists reported replacing the daffodil phytoene synthase gene
(Fig. 20.7) with a similar gene from corn that produces an enzyme with a
higher level of activity, resulting in a variety called golden rice 2 that
produces a 23-fold-higher level of β-carotene than golden rice 1. Moreover, in
contrast to vitamin A, there are no harmful effects when individuals
consume excess amounts of dietary β-carotene. The research that culminated
in the development of golden rice was funded by several nonprofit
agencies, and the companies that hold the patents on the technologies that made
this work possible have agreed to forgo their usual royalties. Therefore, the
rice is expected to be freely available to farmers in the world's poorest
countries.
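
The practical meaning of the 23-fold improvement can be worked out from the figures given above. The sketch below takes "around 3 kg" as exactly 3 kg and assumes that the daily β-carotene target is the same for both varieties; it ignores cooking losses and differences in how efficiently β-carotene is converted to vitamin A, so the result is only a rough illustration.

```python
# Figures taken from the text.
GR1_UG_PER_G = 1.6         # β-carotene in golden rice 1, µg per gram of rice
GR1_DAILY_RICE_G = 3000    # "around 3 kg" per day, treated as exact (assumption)
GR2_FOLD_INCREASE = 23     # golden rice 2 relative to golden rice 1

# Daily β-carotene intake implied by the golden rice 1 figures.
daily_target_ug = GR1_UG_PER_G * GR1_DAILY_RICE_G

# β-carotene content of golden rice 2 and the rice needed to hit the same target.
gr2_ug_per_g = GR1_UG_PER_G * GR2_FOLD_INCREASE
gr2_daily_rice_g = daily_target_ug / gr2_ug_per_g

print(f"implied daily target: {daily_target_ug / 1000:.1f} mg β-carotene")
print(f"golden rice 2 content: {gr2_ug_per_g:.1f} µg/g")
print(f"golden rice 2 needed:  about {gr2_daily_rice_g:.0f} g per day")
```

Under these assumptions, the daily amount drops from about 3 kg to roughly 130 g, which is within the range of a normal serving of rice.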

One of the problems for the more widespread use of golden rice 2 in 
Asia has been a general mistrust of genetically modified foods by some 
consumers. In addition, both golden rice 1 and golden rice 2 were produced 
from subspecies japonica cultivars of rice, which are popular with scientists 
but do not do well in the field in Asia. To remedy this situation, researchers 
are currently introducing the traits from golden rice 1 and golden rice 2 into 
the more popular subspecies indica varieties of rice by traditional genetic 
crossing of the indica strains with the engineered japonica cultivars. As of 
mid-2008, the first field trial of an indica variety of golden rice was taking 
place at the International Rice Research Institute in the Philippines. It is 
FIGURE 20.7 Biosynthesis of β-carotene in rice and vitamin A in humans. The
daffodil phytoene synthase gene (psy) was controlled by a promoter from the
rice seed storage protein glutelin. The phytoene desaturase (crt) gene was
from the bacterium Erwinia uredovora and was controlled by the 35S
promoter. The lycopene β-cyclase (lcy) gene originated from daffodil and was
controlled by the rice glutelin promoter. All three genes were fused to transit
peptides so that the proteins that they encoded would be transported into the
plastid. GGPP, geranylgeranyl pyrophosphate.
hoped, following successful field trials, that golden rice can finally start
fulfilling its promise. 

Folate. Tetrahydrofolate, or vitamin B9, is an essential micronutrient in
the human diet. While the recommended daily intake of this vitamin is only
400 μg, the diets of many individuals in developing countries are
chronically deficient in tetrahydrofolate. Tetrahydrofolate deficiency can
result in anemia and in severe birth defects, including neural tube
defects. For approximately 3 billion people, most of them in developing
countries, rice provides around 80% of the daily caloric intake; however,
rice is a poor source of tetrahydrofolate, as well as many other
micronutrients. To increase the nutritive value of rice, the pathway for the
production of tetrahydrofolate and tetrahydrofolate polyglutamates may be
genetically manipulated.

FIGURE 20.8 (A) Structure of tetrahydrofolate. (B) Major steps in the pathway
leading to the synthesis of tetrahydrofolate and tetrahydrofolate polyglutamates in
rice. A portion of the molecule is synthesized in the chloroplast and a portion in the
cytosol, and the molecule is assembled in the mitochondria. Two enzymatic steps
(marked by asterisks) that represent rate-limiting steps in the biosynthetic pathway
were engineered by the addition of foreign genes to increase the flux through the
pathway.

FIGURE 20.9 The T-DNA construct used to transform rice and produce ferritin in the
seeds. The arrows indicate the direction of transcription. LB, left border; TT,
transcription termination region; bar, the bacterial phosphinothricin acetyltransferase
gene; P35S, 35S promoter from cauliflower mosaic virus; PgluB, the promoter from the
rice seed storage protein glutelin; fer, the soybean ferritin-encoding cDNA; RB, right
border.

Folates are tripartite molecules; that is, they consist of three separate
parts, including pteridine, p-aminobenzoic acid, and one or more
glutamate molecules (Fig. 20.8A). In plants, the pteridine precursors are
synthesized from GTP in the cytosol, p-aminobenzoic acid is synthesized from
chorismate in the chloroplasts, and then both the pteridine precursor and
p-aminobenzoic acid are imported into the mitochondria, where the final
molecule is assembled and modified (Fig. 20.8B). To express high levels of
folate in rice, the first genes in each of the chloroplastic and cytosolic
pathways, starting from chorismate and GTP, respectively, were introduced
from A. thaliana so that the two enzymes from these genes were
overexpressed (Fig. 20.8B). In one transgenic line, the level of tetrahydrofolate was
approximately 100 times the level found in nontransgenic rice plants,
which was more than sufficient to meet the recommended daily dietary
amount of this vitamin. In another study by a different research group,
similar genetic manipulations led to the enhancement of the level of the
vitamin in transgenic tomatoes.

Iron 

The World Health Organization has estimated that iron deficiency affects
approximately 30% of the world's population and is especially problematic
where vegetable-based diets are the primary food source. Although a
number of crops are rich in iron, absorption of this iron is often prevented
by the phytic acid that is present in many of the plants. As a first step toward
developing food crops with sufficient levels of iron to prevent iron
deficiency anemia, scientists engineered rice plants to express the soybean
protein ferritin (Fig. 20.9). Ferritin is an iron storage protein that is found in
animals, plants, and bacteria and carries up to 4,500 iron atoms in its central
cavity, which is formed from the interaction of 24 monomeric ferritin
subunits. The soybean ferritin complementary DNA (cDNA) was cloned into a
binary vector under the transcriptional control of the rice seed storage
protein glutelin promoter (Fig. 20.9), and the entire construct was introduced
into plants by electroporation. In this case, soybean ferritin was expressed in
the seeds of rice plants and not in any other tissues. As a result, the iron
content of rice seeds per gram (dry weight) of tissue was increased to
approximately 2.5 times the original value while the iron content of leaves,
stems, and roots did not change to any significant extent. Based on a typical
daily adult portion of 150 grams of rice, the transgenic ferritin rice provides
approximately 30 to 50% of the recommended daily adult requirement of
iron.
While overproducing ferritin is an important first step in engineering
plants to provide additional dietary iron, it is also necessary to ensure that
the additional iron can be absorbed efficiently. To do this, rice plants were
transformed with three different genes. First, to increase the iron content,
the rice plants were transformed with a ferritin-encoding cDNA from green
beans (Phaseolus vulgaris). Then, to improve the bioavailability of the
introduced iron, plants were transformed separately with cDNAs encoding
phytase (phytate is an inhibitor of iron absorption that can be removed by
microbial phytase) and metallothionein (a family of cysteine-rich,
low-molecular-weight proteins that bind metals through the thiol groups of
cysteine residues) from the fungus Aspergillus fumigatus and from rice,
respectively. To limit the expression of these genes to rice grains, their
expression was controlled by an endosperm-specific promoter (Fig. 20.10).
Each of the three different transgenic plants that were engineered produced
the protein encoded by its transgene. Before these strains are crossed to
generate a plant that can express all three of these transgenes, a more
heat-stable version of the phytase is required, since cooking the rice resulted in
the inactivation of around 90% of the phytase. Nevertheless, because
obtaining sufficient dietary iron is so important to such a large number of
people in the world, these preliminary results are exciting and promise to
eventually have an enormous impact on the health of millions of people.

Phosphorus 

Most of the phosphorus in cereals and legumes is found in the form of 
phytate (phytic acid, or inositol hexaphosphate). Phytate cannot be digested 
by nonruminant animals (or by humans), so the unabsorbed phytate passes 
through the gastrointestinal tract and elevates the amount of phosphorus 


FIGURE 20.10 Genetic constructs used to transform rice and produce ferritin (A),
metallothionein (B), and phytase (C) in seeds. The arrows indicate the direction of
transcription. To ensure that phytase would be secreted into the apoplast, phyA in
construct C was fused to DNA encoding the signal peptide for a barley β-glucanase
gene (not shown). Constructs A and B were introduced into rice plants by A.
tumefaciens-mediated transformation, and construct C was transferred by
microprojectile-mediated transformation. LB, left border; TT, transcription termination region;
hpt, the E. coli hygromycin phosphotransferase gene; P35S, 35S promoter from
cauliflower mosaic virus; PgluB, the promoter from the rice seed storage protein glutelin;
fer, a cDNA encoding green bean ferritin; mth, a cDNA encoding rice
metallothionein; phyA, an A. fumigatus cDNA encoding phytase; RB, right border.
in manure. The excretion of high levels of phosphorus can sometimes lead
to environmental problems, such as eutrophication (excessive plant and
algal growth and decay). To increase the nutritional value of crops such as
soybean and corn to nonruminant animals, such as poultry, swine, and fish,
animal feed is generally supplemented either with phosphorus derived
from rock phosphate or with phytase, which degrades phytate after it is
ingested. Unfortunately, such supplementation is expensive, adding
significantly to the cost of producing these animals. However, beginning in
the early 1990s, several low-phytate mutants were isolated in corn, barley,
rice, wheat, and soybean. In these mutants, an enzyme involved in one of
the many steps in the conversion of glucose-6-phosphate to phytate
(Fig. 20.11) was typically altered so that the amount of phytate in the seed was
often reduced by 50 to 90%. Moreover, the reduction in phytate was
generally accompanied by an increase in inorganic phosphate that maintained
the total seed phosphorus level. The problem with this strategy is that the
systemic reduction of phytate often has negative effects on the whole plant,
resulting in decreases in seed germination, emergence, stress tolerance, and
seed filling. Therefore, as an alternative to limiting the synthesis of phytate,
one group of researchers constructed mutants of corn and soybean that
were defective in the transport of phytate to seeds by silencing the
ATP-binding cassette (ABC) transporter in a seed-specific manner (Fig. 20.11).
The seeds of these plants had approximately 10 to 20% of the normal level
of phytate with a commensurate increase in the level of inorganic
phosphate. Moreover, the phytate levels in the rest of the plant were essentially
unchanged. While these transgenic plants still need to be tested for their
efficacy as animal feed, this approach promises to be important from both
agricultural and environmental perspectives.


Modification of Food Plant Taste and Appearance 

Preventing Discoloration 

The postharvest discoloration of fruits and vegetables is a considerable 
problem for the food industry. A lack of acceptance of discolored foods by 

FIGURE 20.11 Schematic representation of the biosynthesis of phytic acid, by six
separate enzyme-catalyzed reactions, from glucose-6-phosphate and the
subsequent storage of the phytic acid in protein storage vacuoles.
consumers has been dealt with by the food industry through the use of
additives in a wide range of foods. However, the safety of some of these 
food additives, in particular sulfites, has been questioned. 

The enzymes that are thought to be responsible for the initial step in
the discoloration of fruits and vegetables, the oxidation of monophenols
and o-diphenols to o-quinones, are polyphenol oxidases. These
nucleus-encoded enzymes with a molecular weight of around 59,000 are localized
in chloroplast and mitochondrial membranes.

The contention that inhibition of the enzyme polyphenol oxidase
would decrease the extent of discoloration has been tested with transgenic
potatoes carrying a number of different polyphenol oxidase cDNA
constructs. Vectors were constructed with either the full-length or partial
potato polyphenol oxidase cDNA in either the sense or antisense
orientation under the control of the cauliflower mosaic virus 35S promoter, the
granule-bound starch synthase promoter, or the patatin type I promoter
(Fig. 20.12). The last two promoters are specific for the potato tuber. The
two commercial varieties of potato that were transformed with these
constructs are considered to have a good level of intrinsic resistance to black
spot (enzymatic discoloration), so any increase in black spot resistance by
genetic manipulation would be greater than what could be attained by
traditional breeding techniques. Transgenic plants with the polyphenol
oxidase cDNA constructs were deliberately bruised, and then the extent of
black spot damage was assessed. Most of the transgenic potato plants with
an antisense version of the polyphenol oxidase gene under the control of
either the cauliflower mosaic virus 35S promoter or the granule-bound
starch synthase promoter were significantly more resistant to black spot
than the nontransformed potatoes. The patatin promoter, which may not be
fully active in potato tubers, did not prevent polyphenol oxidase
accumulation. The sense constructs all synthesized increased amounts of polyphenol
oxidase and showed larger amounts of black spot than nontransformed
control plants. It is hoped that these and similar antisense constructs will

FIGURE 20.12 Sense and antisense polyphenol oxidase gene constructs. Transcription 
of either the sense or antisense cDNA is separately under the control of the cauli¬ 
flower mosaic virus 35S promoter (P 35S ), the granule-bound starch synthase pro¬ 
moter (P GBSS ), or the patatin type I promoter (P patalin ) with the nopaline synthase 
transcription terminator region (fNOS). RB and LB, right and left borders of the 
T-DNA, respectively. Adapted from Bachem et al., Bio/Technology 12:1101-1105, 
1994. 
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reduce enzymatic discoloration in a wide range of commercially important 
plants. 
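
The difference between the sense and antisense orientations described above
can be sketched in a few lines of Python. This is only an illustration: the
12-nucleotide fragment (toy_cdna) is made up and is not the potato polyphenol
oxidase cDNA, and the function names are hypothetical. Cloning an insert in
the antisense orientation means that the promoter drives transcription of the
reverse complement, so the resulting RNA can base pair with the endogenous
mRNA.

```python
# Illustrative sketch only; the sequence used below is a made-up placeholder.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _validated(seq: str) -> str:
    """Uppercase a DNA sequence and reject anything other than A, C, G, T."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, and T")
    return seq


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return _validated(seq).translate(COMPLEMENT)[::-1]


def transcript(coding_strand: str, orientation: str) -> str:
    """RNA produced when the insert is cloned in the given orientation."""
    if orientation == "sense":
        dna = _validated(coding_strand)
    elif orientation == "antisense":
        dna = reverse_complement(coding_strand)
    else:
        raise ValueError("orientation must be 'sense' or 'antisense'")
    return dna.replace("T", "U")


if __name__ == "__main__":
    toy_cdna = "ATGGCTTCACTG"  # hypothetical fragment, for illustration
    print("sense RNA:    ", transcript(toy_cdna, "sense"))
    print("antisense RNA:", transcript(toy_cdna, "antisense"))
```

The antisense RNA is complementary to the sense RNA over its whole length,
which is the property that the antisense constructs rely on.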

Sweetness 

Even though a fruit or vegetable may have high nutritional value, if it is not
tasty, humans usually will not eat it. Although palatability of food can be
achieved by adding salt, sugar, flavors, or other ingredients during
preparation, it would be advantageous to the food industry if certain foods
could be made intrinsically more appetizing.

Monellin, a protein that is found in the fruit of an African plant with
the unlikely name of serendipity (Dioscoreophyllum cumminsii Diels), is
approximately 3,000 times sweeter than sucrose on a weight basis. This
feature makes monellin a candidate as a sugar substitute, with the added
bonus that, because it is a protein, it would not have the same metabolic
impact as sugar.

Monellin is a dimer with an A chain of 45 amino acid residues and a B
chain of 50 residues; the chains are held together by weak noncovalent
bonds. Unfortunately, the fact that monellin is composed of two separate
polypeptide chains limits its usefulness as a sweetener because it is readily
dissociated (denatured) and consequently loses its sweetness when it is
either heated during cooking or exposed to acid (e.g., lemon juice or
vinegar). Also, the need to clone and express two separate genes in a
coordinated manner complicates efforts to produce the protein in either
transgenic plants or microorganisms. To circumvent this problem, a monellin
gene that encodes both the A and B chains as a single peptide was chemically
synthesized (Fig. 20.13). The fusion protein was produced in transgenic
tomato and lettuce plants. Two different promoters were used to express
the monellin fusion protein gene. In the experiment with tomatoes,
expression was directed by the tomato fruit-specific promoter E8, which is
activated at the onset of fruit ripening. The construct for the lettuce
experiment was under the control of the 35S promoter from cauliflower mosaic
virus. Each construct used the transcription termination-polyadenylation site
from a Ti plasmid nopaline synthase gene. In each case, the synthetic
monellin gene was introduced into plant cells by A. tumefaciens infection,
using the Ti plasmid cointegrate vector system. Monellin was detected in
ripe and partially ripe tomatoes and in lettuce leaves, but not in green
tomatoes. The monellin level in tomatoes could also be elevated by a burst
of the plant hormone ethylene. This strategy for sweetening plants without
sugar or chemical additives would be applicable to a wide range of fruits
and vegetables.

In addition to monellin, several other sweet proteins have been
reported (Table 20.4). The genes for some of these proteins have been
isolated and characterized. Given the demand for low-calorie sweeteners, as
well as the interest in healthy and natural food products, it is likely that
many different plants will be genetically engineered to produce proteins to
increase their sweetness.

FIGURE 20.13 Schematic representation of heat treatment of monellin (A) and
modified monellin (B).

TABLE 20.4 Properties of some sweet-tasting proteins

Protein    Plant source                       Sweetness factor  Molecular   No. of amino acids          Active form
                                              (weight basis)    mass (kDa)
Thaumatin  Thaumatococcus daniellii Benth     3,000             22.2        207                         Monomer
Monellin   Dioscoreophyllum cumminsii Diels   3,000             10.7        45 (A chain), 50 (B chain)  AB dimer
Mabinlin   Capparis masakai Levl.             100               12.4        33 (A chain), 72 (B chain)  AB dimer
Pentadin   Pentadiplandra brazzeana Baillon   500               12.0        ND                          ND
Brazzein   Pentadiplandra brazzeana Baillon   2,000             6.5         54                          Monomer
Curculin   Curculigo latifolia                550               24.9        114                         A2 dimer
Miraculin  Richadella dulcifica               ND                98.4        191                         A4 tetramer

ND, not determined; kDa, kilodaltons.

Fructans are naturally occurring polymers of fructose that are not usually
degraded in the human digestive tract. However, some beneficial bacteria
in the human intestinal tract can utilize fructans. Small fructans, i.e., up
to five monosaccharide units, have a sweet taste and can be used as a
natural low-calorie sweetener instead of sucrose. Fructans are generally
produced from sucrose by fungal invertases on a large scale in an expensive
process, or they are extracted from the roots of chicory plants or Jerusalem
artichoke tubers. To develop an inexpensive means of producing fructans,
researchers engineered sugar beet plants to convert the sucrose that they
normally store in the vacuoles of taproot parenchyma cells into fructans.
This was done by transforming sugar beets with a genetic construct that
contained the 1-sucrose:sucrose fructosyl transferase cDNA from Jerusalem
artichokes under the transcriptional control of the 35S promoter. To
transform sugar beets, the genetic construct was introduced into protoplasts
in the presence of 20% polyethylene glycol. The transgenic sugar beets
accumulated fructan up to 40% of the taproot dry weight. The mixture of
fructans that was produced by the transgenic sugar beet plants is essentially
the same as the fructans that are produced enzymatically. Biologically
produced fructan may be an attractive alternative to the current industrial
process and can be sold as either a nondigestible sweetener or a health
component that improves the intestinal flora.

Starch 

The starch that is found in most crop plants, such as potato, consists of 20
to 30% (straight-chain) amylose and 70 to 80% (branched-chain)
amylopectin. The ratio of amylose to amylopectin has a large influence on the
physical and chemical properties of the starch. For many industrial
applications, it would be useful to have starch that is highly enriched in
either amylose or amylopectin (Fig. 20.14). Starch is normally synthesized in
a stepwise process (Fig. 20.15). The enzyme ADP-glucose pyrophosphorylase
catalyzes the reaction of ATP with glucose-1-phosphate to form ADP-glucose
and pyrophosphate. The enzyme starch synthase catalyzes the transfer of
glucose from ADP-glucose to the nonreducing end of a preexisting glucan
chain, which is a short version of an amylose chain, by means of an α-1,4
linkage. Branching can occur, catalyzed by starch-branching enzyme, when
two glucan chains are joined by an α-1,6 linkage. To generate potatoes with
a high percentage of amylose and a correspondingly low percentage of
amylopectin, plants were transformed by using A. tumefaciens with antisense
versions of the starch-branching enzyme gene under the transcriptional
control of the 35S promoter. In transgenic lines that exhibited only
approximately 1% of the normal amount of the starch-branching enzyme
activity, the fraction of amylose increased from around 28% to 60 to 89% of
the starch content. This is an initial step toward developing potatoes with
unique starch compositions that may be used either for food or as a source
of unusual starches with unique industrial properties.

FIGURE 20.14 (A) A portion of an amylose chain with only α-1,4 linkages; (B) a
portion of an amylopectin chain with α-1,4 and α-1,6 linkages.

The freezing and thawing that are common in the development and 
use of frozen foods often lead to unwanted changes in the texture of the 
starch component, making the product less attractive to consumers. Under 
these conditions, the starch can separate into a solid phase and a liquid 
phase, a process known as syneresis. The long, unbranched amylose chains 
have a much greater tendency to form separate phases following freezing 
and thawing than the shorter, branched amylopectin chains. One approach 
to creating a starch that could withstand freezing and thawing would be to 
modify the starch structure by decreasing the synthesis of amylose chains. 
This was achieved by transforming potatoes with antisense versions of
three different starch synthase genes (Fig. 20.15). This reduced the
expression of each of these genes to very low levels and resulted in a
dramatic decrease in the amount of amylose and, at the same time, in the
length of the amylopectin chains. These modified potatoes are currently being
thoroughly tested to ensure that a variety of physical and chemical properties
are unaltered so that they can be used in the production of frozen foods.

FIGURE 20.15 The major reactions in the biosynthesis of starch. ADP-glucose
pyrophosphorylase converts glucose-1-phosphate and ATP to ADP-glucose; starch
synthase adds glucose from ADP-glucose to a glucan chain to form amylose; and
starch-branching enzyme joins glucan chains to form amylopectin.

Since starch is both inexpensive and very abundant, it is used in a
number of industrial processes as a thickener, a gelling agent, or an
adhesive. In addition, it is a substrate for the production of high-fructose
syrups and ethanol (see chapter 14). The production of high-fructose syrups
requires the starch to be degraded, usually at high temperature, by the
enzyme α-amylase and then treated, also at high temperature, with glucose
isomerase, which converts the glucose (released from the starch that was
degraded by α-amylase) into fructose. The enzymes that catalyze these
reactions account for a major portion of the cost of the industrial
conversion of starch into fructose. Therefore, reducing the cost of the
enzymes makes the overall process more attractive. With this in mind,
researchers constructed a bifunctional enzyme containing the essential regions
of both α-amylase and glucose isomerase, using the polymerase chain reaction
(PCR) to fuse together the genes for these enzymes (Fig. 20.16). Of course,
care was taken to ensure that the chimeric gene retained the correct reading
frames of the two genes. When the bifunctional enzyme was synthesized in
E. coli, both α-amylase and glucose isomerase activities were found, and
both activities were identical to those found in the thermotolerant source
organisms. The chimeric gene was introduced into potato plants under the
transcriptional control of the granule-bound starch synthase promoter.
Biochemical analysis performed on transgenic potatoes, after treatment at
65°C (the optimal temperature for the α-amylase), revealed a 3.9-fold and
a 14.7-fold increase compared with nontransgenic potatoes in the
concentrations of glucose and fructose, respectively. At 25°C, the glucose
and fructose concentrations were the same in transgenic and nontransgenic
potatoes. Thus, both enzymes function optimally at high temperatures and
only very little or not at all at low temperatures. This work indicates that it
is technically feasible to engineer potatoes to produce their own α-amylase
and glucose isomerase and that the enzyme that is produced can then be
used directly, without the need for any purification, in a process directed
toward producing fructose.
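
The requirement that a chimeric gene keep both coding regions in the same
reading frame can be checked mechanically. The sketch below is a simplified
illustration, not the procedure used by the researchers: the function name,
the toy sequences, and the six-nucleotide linker are all hypothetical. It
assumes that the upstream coding region has had its stop codon removed and
that the downstream coding region ends with a stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_in_frame(upstream_orf: str, linker: str, downstream_orf: str) -> bool:
    """Return True if the three parts read as one uninterrupted open reading frame.

    upstream_orf   -- first coding region, stop codon already removed
    linker         -- junction sequence contributed by the PCR primers (may be empty)
    downstream_orf -- second coding region, ending with its stop codon
    """
    parts = [p.upper() for p in (upstream_orf, linker, downstream_orf)]
    # Each part must be a whole number of codons, otherwise the second
    # coding region would be read in the wrong frame.
    if any(len(p) % 3 for p in parts):
        return False
    fused = "".join(parts)
    if not fused:
        return False
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # A stop codon anywhere before the end would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return False
    return codons[-1] in STOP_CODONS


if __name__ == "__main__":
    # Toy sequences; GGATCC stands in for a primer-encoded restriction site.
    print(fusion_in_frame("ATGGCT", "GGATCC", "ATGAAATAA"))  # in frame
    print(fusion_in_frame("ATGGCT", "GGATC", "ATGAAATAA"))   # frameshift
```

A linker whose length is not a multiple of three shifts the frame of the
second enzyme, which is why the nonmatching portions of the PCR primers have
to be designed with care.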

FIGURE 20.16 Construction of an α-amylase-glucose isomerase fusion gene. The
α-amylase and glucose isomerase genes were from Bacillus stearothermophilus and
Thermus thermophilus, respectively. The portions of the PCR primers that do not
match the target genes are designed to retain both genes in the same reading frame
and to include appropriate restriction endonuclease sites. Adapted from Beaujean
et al., Biotechnol. Bioeng. 70:9-16, 2000.

For many of the applications that employ plants containing high levels
of starch, it would be desirable to be able to significantly increase the
amount of starch produced by each plant. While it may be possible to
manipulate the genes for some of the enzymes involved in the biosynthesis
of starch, another approach to increasing the starch yield would be to
increase the supply of ATP so that the flux through the first reaction of
starch biosynthesis is increased (Fig. 20.15). This may be done by decreasing
the level of the enzyme adenylate kinase, which catalyzes the
interconversion of ATP and AMP into ADP (Fig. 20.17). To do this, a genetic
construct encoding a plastid version of adenylate kinase in an antisense
orientation was used to transform potato plants. In the resulting transgenic
plants, the activity of adenylate kinase decreased significantly in leaves and
tubers and dramatically in chloroplasts (Table 20.5). Moreover, the transgenic
plants showed an altered pattern of activity of the enzymes involved in the
biosynthesis of starch and, most importantly, a significant increase in both
the yield of potato tubers and the amount of starch in each potato tuber.
This manipulation opens up the possibility of modifying other
starch-containing plants in a similar manner.


FIGURE 20.17 Regulation of adenylate pools by the action of the enzyme
adenylate kinase, which catalyzes the reversible reaction ATP + AMP ⇌ 2 ADP.


Genetic Manipulation of Flower Pigmentation 

The worldwide value of the flower industry, at the consumer level, is
around $150 billion. This includes the value of cut flowers as well as pot
and bedding plants. The main areas of both flower production and
consumption are the United States, Europe, and, to a lesser extent, Japan and
China. In addition, several countries, including Colombia, Ecuador,
Ethiopia, Israel, Kenya, Morocco, and Turkey, are major producers but not
major consumers of flowers.



























TABLE 20.5 Comparison of nontransformed potatoes with potatoes transformed
with an antisense version of adenylate kinase

Trait or activity                                         Nontransformed  Transgenic
Leaf adenylate kinase activity (μmol/g [FW]/min)          21.3            14.4
Chloroplast adenylate kinase activity (μmol/g [FW]/min)   12.4            5.6
Tuber adenylate kinase activity (μmol/g [FW]/min)         15.2            9.7
AGPase activity (pmol/g [FW]/min)                         519             717
Soluble starch synthase activity (μmol/g [FW]/min)        143             109
Branching enzyme activity (μmol/g [FW]/min)               6,921           7,841
Total tuber yield (g)                                     867             1,596
Starch (g/plant)                                          106             224

FW, fresh weight; AGPase, ADP-glucose pyrophosphorylase.
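
The size of the changes reported in Table 20.5 is easier to judge as ratios.
The short sketch below simply recomputes the transgenic-to-nontransformed
ratio and the percent change for each row; the numbers are transcribed from
the table, while the dictionary and function names are hypothetical and the
units are left out because only ratios are calculated.

```python
# (nontransformed, transgenic) values transcribed from Table 20.5.
TABLE_20_5 = {
    "Leaf adenylate kinase activity": (21.3, 14.4),
    "Chloroplast adenylate kinase activity": (12.4, 5.6),
    "Tuber adenylate kinase activity": (15.2, 9.7),
    "AGPase activity": (519, 717),
    "Soluble starch synthase activity": (143, 109),
    "Branching enzyme activity": (6921, 7841),
    "Total tuber yield": (867, 1596),
    "Starch per plant": (106, 224),
}


def fold_change(control: float, transgenic: float) -> float:
    """Ratio of the transgenic value to the nontransformed value."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return transgenic / control


def percent_change(control: float, transgenic: float) -> float:
    """Change relative to the nontransformed value, in percent."""
    return (fold_change(control, transgenic) - 1.0) * 100.0


if __name__ == "__main__":
    for trait, (control, transgenic) in TABLE_20_5.items():
        print(f"{trait:40s} {fold_change(control, transgenic):5.2f}-fold "
              f"({percent_change(control, transgenic):+.0f}%)")
```

Read this way, the chloroplast adenylate kinase activity falls by a little
more than half, while the tuber yield nearly doubles and the starch per plant
slightly more than doubles.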


The most important cut-flower crops are roses, carnations, tulips, lilies,
gerberas, and chrysanthemums. Genetic transformation protocols have
been worked out for most of the major commercial flower-producing
plants. For example, transgenic chrysanthemums with both sense and
antisense constructs of the chrysanthemum chalcone synthase cDNA have
been produced. Chalcone synthase catalyzes the first step in anthocyanin
biosynthesis (Fig. 20.18). Both the sense and the antisense cDNAs can
suppress chalcone synthase gene expression in transgenic plants and produce
white flowers instead of the normal pink. Sense suppression, which is also
called cosuppression, occurs when an additional copy of an endogenous
gene prevents the accumulation of the messenger RNA (mRNA) from the
endogenous gene, typically through the production of interfering RNA. On
the other hand, the antisense chalcone synthase RNA should block
translation of endogenous chalcone synthase mRNA.

The sense and antisense constructs were placed under the control of
the cauliflower mosaic virus 35S promoter on a binary Ti plasmid vector
and then introduced into plant cells. Three of the 133 sense transformants
and three of the 83 antisense transformants produced white flowers, which
indicated that endogenous chalcone synthase gene expression and, as a
consequence, anthocyanin synthesis had been suppressed. The
white-flowering plants were propagated vegetatively through cuttings, and
approximately 90 to 98% of the plants continued to produce white flowers
when planted in the field.
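
The low frequency of white-flowering transformants has a practical
consequence for how many plants must be screened. The sketch below computes
the observed frequencies from the counts given above and then, under the
simplifying assumption that each transformant is an independent trial with
the same success probability, estimates how many transformants would have to
be screened to have a 95% chance of recovering at least one suppressed line.
The function names and the 95% threshold are illustrative choices, not values
from the study.

```python
import math


def suppression_frequency(suppressed: int, total: int) -> float:
    """Fraction of transformants that show the suppressed (white) phenotype."""
    if total <= 0 or not 0 <= suppressed <= total:
        raise ValueError("need 0 <= suppressed <= total and total > 0")
    return suppressed / total


def transformants_to_screen(frequency: float, confidence: float = 0.95) -> int:
    """Smallest n for which P(at least one suppressed line) >= confidence.

    Assumes independent transformants with identical success probability,
    i.e., P(none in n) = (1 - frequency) ** n.
    """
    if not 0 < frequency < 1 or not 0 < confidence < 1:
        raise ValueError("frequency and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - frequency))


if __name__ == "__main__":
    for label, white, total in (("sense", 3, 133), ("antisense", 3, 83)):
        freq = suppression_frequency(white, total)
        print(f"{label}: {100 * freq:.1f}% white-flowering; screen about "
              f"{transformants_to_screen(freq)} transformants for a 95% chance of one")
```

With frequencies of only a few percent, on the order of a hundred
transformants are needed in either orientation, which is consistent with the
numbers of plants that were actually generated.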

The flower industry is continually attempting to improve flower
appearance and postharvest lifetime. By traditional breeding techniques,
over the years it has been possible to create thousands of new varieties that
differ from one another in color, shape, and plant architecture. However,
traditional plant breeding is a slow and painstaking procedure that is
limited by the gene pool of a particular species; thus, for example, no one
has been able to breed a blue rose. As an alternative to traditional breeding
techniques, uniquely colored flowers can be developed by manipulating
the genes for enzymes in the anthocyanin biosynthesis pathway.
Anthocyanins, which are a class of flavonoids, are the most common type
of flower pigment and are the major constituent in orange, red, violet, and
blue flowers. They are synthesized from the amino acid phenylalanine by
a series of enzyme-catalyzed reactions. The color of the flower is
determined by the side chain substitutions on the different anthocyanin
structures, with the cyanidin derivatives producing more red and the
delphinidin derivatives producing more blue (Fig. 20.18). Moreover, plants
sometimes contain both flavonoids and carotenoids, and it is the combination
of the two that produces the wide range of colors seen in nature.

FIGURE 20.18 Biosynthesis of anthocyanins. CoA, coenzyme A; CHS, chalcone
synthase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; F3'H, flavonoid
3'-hydroxylase; F3'5'H, flavonoid 3',5'-hydroxylase; DFR, dihydroflavonol
4-reductase; 3GT, UDP-glucose:flavonoid 3-O-glucosyltransferase. Petunia DFR can
convert dihydroquercetin to cyanidin-3-glucoside and can convert
dihydromyricetin to blue delphinidin-3-glucoside. In addition to these
conversions, corn DFR can convert dihydrokaempferol to pelargonidin-3-glucoside.

While the petunia enzyme dihydroflavonol 4-reductase can convert
colorless dihydroquercetin to red cyanidin-3-glucoside and colorless
dihydromyricetin to blue delphinidin-3-glucoside, it cannot use colorless
dihydrokaempferol as a substrate (Fig. 20.18). However, when petunias were
transformed with a dihydroflavonol 4-reductase gene from corn, the
flowers of the transgenic plants were brick red-orange. This unique color,
which had never been seen before in petunias, was due to the production
of pelargonidin-3-glucoside by the transgenic plants. Moreover, following
laboratory manipulation of transgenic flower pigmentation and
subsequent field testing, in 1996, the company Florigene introduced Moondust,
a mauve carnation, into the marketplace, followed by Moonshadow, a
violet carnation, in 1998. Conventional breeding had failed to produce
carnations with hues in the mauve-blue-violet range because carnations
lack the ability to produce the blue pigment, delphinidin. Four additional
varieties of carnations that feature different tones of violet and blue
have been added. To date, over 75 million of these flowers have been sold
worldwide.

By mid-2009, several uniquely colored transgenic roses were undergoing
field trials and were expected to be available to consumers around
2010. More than two dozen field tests with new designer plants have been
permitted. Among them are light-blue torenias, bronze-colored forsythia,
and yellow petunias.

In one recent study, the gene for the production of anthocyanin pigment 1
(Pap1) Myb transcription factor from the plant A. thaliana was stably
introduced into petunia plants (Petunia hybrida). This transcription factor is
known to regulate the production of nonvolatile phenylpropanoids, including
anthocyanins. Surprisingly, in addition to an increase in pigmentation,
Pap1-transgenic petunia flowers demonstrated a very large increase in the
production of volatile phenylpropanoid-benzenoid compounds. This
coordinated regulation of petunia flower color and scent production by Pap1
provides a clear advantage for plant survival in terms of attracting
pollinator insects. In addition, this work suggests a strategy to create
flowers with both novel colors and enhanced scents.

The carotenoid astaxanthin, which provides the characteristic pink
color to salmon, trout, and shrimp, is synthesized by marine bacteria and
microalgae and then passed on to fish through the food chain. More
important, astaxanthin protects salmon and trout eggs from damage by UV
radiation and improves the survival and growth rate of juveniles. Most
likely, these properties of astaxanthin are related to its function as a
powerful antioxidant. However, when fish are grown in aquaculture, they are
separated from the natural food chain and astaxanthin must be added to
their feed in order to impart the typical pink color to their flesh. Currently,
astaxanthin is chemically synthesized and accounts for approximately 15%
of the total cost of salmon farming.

To produce astaxanthin biologically, one group of researchers first
cloned a cDNA encoding the enzyme β-carotene ketolase (β-C-4
oxygenase) from the unicellular green alga Haematococcus pluvialis. When this
cDNA was expressed in tobacco plants that contain β-carotene and the
gene for β-carotene hydroxylase, astaxanthin was synthesized. By
appropriate genetic manipulation, astaxanthin has been synthesized in tobacco
flowers. The cDNA for the algal β-carotene ketolase was fused to DNA
encoding a chloroplast transit peptide and used to transform tobacco
plants. To limit the expression of astaxanthin to flowers and fruits, the
cDNA for the algal β-carotene ketolase was fused to the promoter of the
tomato pds gene, which is active primarily in tomato reproductive tissues.
To increase the expression of the cDNA for the algal β-carotene ketolase, the
DNA fragment carrying the pds promoter was modified by deleting
portions of its DNA sequence and then fused to a β-glucuronidase gene
(Fig. 20.19). The construct that had a 305-bp deletion from the 5' terminus of
the pds promoter showed a decrease in β-glucuronidase activity in leaves,
sepals, and petals and a very large increase in activity in flower ovaries
(nectaries). To obtain maximal gene expression in ovaries, the 305-bp
deleted pds promoter was placed upstream of the cDNA for the algal
β-carotene ketolase (Fig. 20.20A). The net result of all of these genetic
manipulations was that, once the introduced transferred DNA (T-DNA)
had been inserted into the plant genomic DNA, the algal β-carotene
ketolase, together with a transit peptide, was inserted through the chromoplast
membrane, with the transit peptide being removed in the process. Once
inside the chromoplast, the algal β-carotene ketolase worked in concert
with the endogenous β-carotene hydroxylase to convert the β-carotene to
astaxanthin, which accumulated in the flower nectaries (Fig. 20.20B). The
advantage to producing astaxanthin in plants rather than bacteria or other
microorganisms is that plants can store large amounts of carotenoids inside
cells in lipid vesicles within the plastids. Thus, plants can accumulate 10- to
50-fold-higher concentrations of carotenoids than microorganisms, whose
membranes are damaged by high concentrations of carotenoids. These
manipulations have the potential to dramatically lower the cost of the
astaxanthin that is used in salmon farming.

FIGURE 20.19 Activity of the β-glucuronidase reporter protein (in relative units)
directed by the pds promoter and two of its deletion derivatives. The full-length
promoter is approximately 1.0 kb, the deletion 1 derivative is 0.7 kb, and the
deletion 2 derivative is 0.44 kb. All deletions are from the 5' end of the DNA
promoter fragment.


FIGURE 20.20 (A) Schematic representation of the genetic construct used to
overproduce astaxanthin in tobacco flowers. The activity of the tomato pds gene
deletion 1 promoter is shown in Fig. 20.19. The β-carotene ketolase gene is from
the green alga H. pluvialis. The chloroplast transit peptide ensures that the
protein will be expressed in the chromoplast. (B) The conversion of β-carotene to
astaxanthin.


Plants as Bioreactors

Plants are easy to grow and can generate considerable biomass. With these
features in mind, research has been carried out to determine whether
transgenic plants can be used for the production of commercial proteins and
chemicals. Unlike recombinant bacteria, which are grown in large
bioreactors, a process that requires highly trained personnel and expensive
equipment, crops can be produced relatively inexpensively by less-skilled
workers (Table 20.6). In addition, when proteins that are intended for
human use are produced in transgenic plants, there is a significantly
reduced risk of mammalian virus contamination in comparison to proteins
that are produced in animal cells grown in culture. Ultimately, the biggest
hurdle to overcome in the production of foreign proteins in plants is the
purification of the product of a transgene from the mass of plant tissue. On
a laboratory scale, plants have been used to produce monoclonal antibodies
and antibody fragments; the polymer polyhydroxybutyrate, which can be
used to make a biodegradable plastic-like material; and a number of
potential therapeutic agents (Table 20.7) and vaccine antigens (Table 20.8).


TABLE 20.6 Comparison of recombinant protein production in plants and other systems

Parameter                        Bacteria  Yeast      Mammalian cell culture  Transgenic plants
Glycosylation                    None      Incorrect  Correct                 Generally correct
Assembles multimeric proteins    Limited   Limited    Limited                 Yes
Production costs                 Medium    Medium     High                    Low
Protein-folding accuracy         Low       Medium     High                    High
Protein yield                    High      High       Medium                  Medium
Scale-up costs                   High      High       High                    Low
Time required                    Low       Low        High                    Medium
Skill level required for growth  Medium    Medium     High                    Low


TABLE 20.7 Some of the therapeutic agents produced in transgenic plants

Protein                                                 Plant(s)                            Application(s)
Human protein C                                         Tobacco                             Anticoagulant
Human hirudin variant 2                                 Tobacco, canola, Ethiopian mustard  Anticoagulant
Human granulocyte-macrophage colony-stimulating factor  Tobacco                             Neutropenia
Human erythropoietin                                    Tobacco                             Anemia
Human enkephalins                                       Thale cress, canola                 Antihyperanalgesic by opiate activity
Human epidermal growth factor                           Tobacco                             Wound repair, control of cell proliferation
Human α-interferon                                      Rice, turnip                        Hepatitis C and B
Human serum albumin                                     Potato, tobacco                     Liver cirrhosis
Human hemoglobin                                        Tobacco                             Blood substitute
Human homotrimeric collagen I                           Tobacco                             Collagen synthesis
Human α1-antitrypsin                                    Rice                                Cystic fibrosis, liver disease, hemorrhage
Human growth hormone                                    Tobacco                             Dwarfism, wound healing
Human aprotinin                                         Corn                                Trypsin inhibitor for transplantation surgery
Angiotensin-1-converting enzyme                         Tobacco, tomato                     Hypertension
α-Trichosanthin                                         Tobacco                             HIV therapy
Glucocerebrosidase                                      Tobacco                             Gaucher disease
Human muscarinic cholinergic receptors                  Tobacco                             Central and peripheral nervous system
Human interleukin-2 and interleukin-4                   Tobacco                             Immunotherapy
Human placental alkaline phosphatase                    Tobacco                             Children with achondroplasia or cretinism
Human insulin                                           Safflower                           Diabetes
Trout growth factor                                     Tobacco                             Fish growth
Lipase                                                  Corn                                Cystic fibrosis
Lactoferrin                                             Rice                                Diarrhea

HIV, human immunodeficiency virus.


Antibodies 

The production of antibodies and antibody fragments in transgenic plants 
has several potential advantages over their synthesis in recombinant micro¬ 
bial cells (Table 20.6). For example, transformation of plants generally 
results in the stable integration of the foreign DNA into the plant genome, 
while most microorganisms are transformed with plasmids that can be lost 
during a prolonged or large-scale fermentation. In addition, the processing 
and assembly of foreign proteins in plants are similar to those in animal 
cells, whereas bacteria do not readily process, assemble, or posttranslation- 
ally modify eukaryotic proteins. Moreover, plants are inexpensive to grow 
on a large scale, and their production is not limited by fermentation capa¬ 
bility—it is estimated that it costs approximately $5,000 per gram to pro¬ 
duce antibodies from hybridoma cells in culture, $1,000 per gram to 
produce antibodies from transgenic bacteria, and $10 to $100 per gram to 
produce antibodies from transgenic plants. However, since most harvested 
plant tissues cannot usually be stored for long periods, foreign proteins 
might be produced in seeds, where they will be stable for long periods 
under ambient conditions. To date, a large number of antibodies, including 
immunoglobulin G (IgG), IgM, single-chain Fv fragments, and Fab frag¬ 
ments, have been produced in plants (Table 20.9). Some of these plant- 
produced antibodies (sometimes called plantibodies) have been purified 
and used for diagnostic and therapeutic purposes, and others have been 

TABLE 20.8 Some recombinant vaccine antigens expressed in plants

Vaccine antigen                                                       Plant(s) or vector
Hepatitis virus B surface proteins                                    Tobacco, potato, yellow lupin, lettuce
Malaria parasite antigen                                              Virus
Rabies virus glycoprotein                                             Tomato
Human rhinovirus 14 and human immunodeficiency virus (HIV) epitopes   Virus
E. coli heat-labile enterotoxin                                       Tobacco, potato
Norwalk virus capsid protein                                          Tobacco, potato
Diabetes-associated autoantigen                                       Tobacco, potato, carrot
Mink enteritis virus epitope                                          Virus
Rabies and HIV epitopes                                               Virus
Foot and mouth disease VP1 structural protein                         Arabidopsis, alfalfa
Cholera toxin B subunit                                               Potato
Human insulin-cholera toxin B subunit fusion protein                  Potato
Human cytomegalovirus glycoprotein B                                  Tobacco
Dental caries (S. mutans)                                             Tobacco
Respiratory syncytial virus                                           Tomato

Note that in some cases the antigen was cloned into a transient-expression system, such as a plant virus (usually tobacco mosaic
virus), that could be sprayed onto the leaves of a variety of different plants and begin producing protein within 2 weeks.


TABLE 20.9 Some antibodies and antibody fragments that have been produced in plants

Host plant  Antigen
Tobacco     Phosphonate ester
Tobacco     (4-Hydroxy-3-nitrophenyl)acetyl
Tobacco     Phytochrome
Tobacco     Artichoke mottled crinkle virus
Tobacco     Human creatine kinase
Tobacco     Streptococcus mutans cell surface antigen SAI/II
Tobacco     Fungal cutinase
Tobacco     Oxazolone
Tobacco     Abscisic acid
Tobacco     Cell surface protein from mouse B-cell lymphoma
Tobacco     Human carcinoembryonic antigen
Tobacco     Tobacco mosaic virus
Tobacco     Gibberellin
Tobacco     Beet necrotic yellow vein virus coat protein
Tobacco     Stolbur phytoplasma membrane protein
Tobacco     Root rot nematode surface glycoprotein
Petunia     Dihydrofolate reductase
Soybean     Herpes simplex virus
Pea         Abscisic acid
Pea         Human cancer cell surface antigen
Tobacco     Substance P (neuropeptide)
Tobacco     CD40 (cell surface protein)
Tobacco     38C13 mouse B-cell lymphoma
Alfalfa     Human IgG

used to protect the plant against certain pathogenic agents, such as viruses.
In a small number of instances, plant-produced antibodies are being tested 
in clinical trials to determine whether they are essentially equivalent to 
antibodies produced in other host cells. 
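
The production-cost estimates quoted earlier in this section can be compared directly. The following is a minimal sketch that uses only the three per-gram figures given in the text; the dictionary layout and the function name `fold_savings` are illustrative choices, not part of the original.

```python
# Approximate cost (US dollars) to produce 1 gram of antibody, as quoted in the text.
# Each entry is a (low, high) range; single estimates repeat the same value.
COST_PER_GRAM = {
    "hybridoma cells": (5000, 5000),
    "transgenic bacteria": (1000, 1000),
    "transgenic plants": (10, 100),
}


def fold_savings(reference, alternative):
    """Return the (smallest, largest) fold reduction in cost per gram
    when moving from the reference system to the alternative system."""
    ref_low, ref_high = COST_PER_GRAM[reference]
    alt_low, alt_high = COST_PER_GRAM[alternative]
    # Smallest saving: cheapest reference against the most expensive alternative.
    # Largest saving: most expensive reference against the cheapest alternative.
    return ref_low / alt_high, ref_high / alt_low


if __name__ == "__main__":
    for system in ("hybridoma cells", "transgenic bacteria"):
        low, high = fold_savings(system, "transgenic plants")
        print(f"{system} -> transgenic plants: {low:.0f}- to {high:.0f}-fold cheaper")
```

On these figures, plants would be roughly 50- to 500-fold cheaper per gram than hybridoma culture and 10- to 100-fold cheaper than bacteria, before any purification costs, which the text does not quantify.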

The effective industrial-scale production of antibodies synthesized by 
plants has until recently been hampered by a very low yield, typically in 
the range of 1 to 10 µg/g of fresh biomass. Moreover, with conventional
transgenic plant technology, it is estimated that it takes about 2 years from
the beginning of the cloning process to produce gram quantities of antibody.
The use of transient-expression systems can significantly speed up
this process, but it is still problematic to coordinate the synthesis and 
assembly of the two different polypeptides that are integral components of 
antibody molecules. Moreover, traditional transient-expression systems are 
limited by the low infectivity of the viral vectors, especially those carrying 
medium-size to large inserts. It is also necessary to ensure that transient- 
expression systems based on viral vectors do not spread to plants other 
than the intended host. This is typically done by using mutant plant viruses 
that are unable to produce capsid (coat) proteins and therefore cannot form 
functional viral particles. However, these systems generally produce only 
very low levels of active antibody. Fortunately, an alternative transient- 
expression system has been developed. This expression system involves 
coinfection of plant cells with two separate plant virus vectors, one based 
on tobacco mosaic virus and the other on potato virus X, so that the two 
vectors do not compete with one another but rather can coexist within the 
same cell. These vectors coreplicate within the plant with each vector 
expressing a different antibody chain, i.e., one expresses the light chain and 
the other expresses the heavy chain (Fig. 20.21). This system has been used 
to increase the amount of IgG antibody that is typically synthesized by a 
plant transient-expression system by about 100-fold, so that it is possible to 
produce up to 0.5 mg of assembled IgG antibody per gram of fresh leaf 
biomass. At this high level of expression, it is possible to grow transgenic 
plants that produce specific IgG molecules in small areas indoors in controlled
greenhouses, thereby avoiding environmental concerns regarding
the inadvertent "escape" of antibody-producing plants. The potential of 
this technology provided the incentive for the sale of the company where it 
was originally developed to a larger company that hopes to open a clinical- 
grade manufacturing plant with the objective of beginning clinical trials of 
the antibodies produced. 
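
To see why the yield improvement makes indoor production plausible, the yields above can be converted into the fresh leaf biomass needed per gram of antibody. This is a rough sketch that reads the conventional yield as 1 to 10 µg per gram of fresh biomass and the improved yield as 0.5 mg (500 µg) per gram; it assumes complete recovery during purification, which is an idealization, and the function name is hypothetical.

```python
MICROGRAMS_PER_GRAM = 1_000_000  # 1 g of antibody expressed in micrograms


def biomass_kg_per_gram_antibody(yield_ug_per_g):
    """Fresh leaf biomass (kg) needed to obtain 1 g of antibody at the
    given yield (micrograms of antibody per gram of fresh biomass),
    assuming no losses during extraction and purification."""
    grams_of_biomass = MICROGRAMS_PER_GRAM / yield_ug_per_g
    return grams_of_biomass / 1000  # convert grams to kilograms


if __name__ == "__main__":
    for label, yield_ug in (("conventional, low end", 1),
                            ("conventional, high end", 10),
                            ("two-vector transient system", 500)):
        kg = biomass_kg_per_gram_antibody(yield_ug)
        print(f"{label}: {kg:g} kg of fresh leaf per gram of antibody")
```

Under these assumptions the requirement drops from 100 to 1,000 kg of leaf per gram of antibody to about 2 kg, which is the scale at which greenhouse cultivation becomes practical.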


FIGURE 20.21 Schematic representation of a portion of the viral vectors used to produce
full-size IgG antibodies in plants. In each case, the viral replicase and movement
protein (MP) are cloned together with the gene (cDNA) for either a light chain
(LC) or a heavy chain (HC). Each antibody gene is controlled by a promoter, a signal 
peptide (to ensure secretion), and a transcription termination region from the 
appropriate virus, none of which are shown. The viruses used included tobacco 
mosaic virus (TMV) and potato virus X (PVX). In both cases, the viruses were 
unable to replicate because of the absence of the viral coat protein. Recombinant 
viruses were introduced into plants as part of the T-DNA that is transferred during 
A. tumefaciens infection. 

Polymers

It is costly to produce the polymer poly(3-hydroxybutyric acid), which is 
used in the synthesis of biodegradable plastics, by bacterial fermentation. 
Consequently, research has been conducted to determine if the polymer 
could be produced at a lower cost in plants. In bacteria, such as Alcaligenes 
eutrophus, poly(3-hydroxybutyric acid) is synthesized from acetyl coenzyme A
in three steps catalyzed by three enzymes (see Fig. 13.36). The
genes that encode these enzymes are organized in a single operon. Since
plants are unable to process the transcript of an operon with more than one 
gene, each of the three genes was isolated and cloned separately into a 
plasmid. The genes were targeted to the chloroplast of the plant A. thaliana 
because previous experiments had demonstrated that cytoplasmic synthesis
produced only low levels of the polymer, and the transgenic plants
were highly stunted. Moreover, chloroplasts can accumulate high levels of
starch, another biological polymer, so it was thought that they would similarly
be able to accumulate large amounts of poly(3-hydroxybutyric acid).
Unlike highly valued proteins that are used as therapeutics or specialty 
chemicals, biological polymers need to be produced at high levels in plants 
for production to be economically feasible. 

Each of the three poly(3-hydroxybutyric acid) biosynthesis genes was 
fused to a DNA fragment that encodes the chloroplast transit peptide of the 
small subunit of pea ribulose bisphosphate carboxylase and was placed 
under the transcriptional control of the cauliflower mosaic virus 35S promoter.
Separate plants were transformed with each construct using Ti
plasmid binary vectors. Two transgenic plants, each with a different foreign
gene in its genomic DNA, were crossed to form a transgenic plant with two 
foreign genes. Then the double-gene-transgenic plant was crossed with a 
transgenic plant that carried the third foreign gene, and a transgenic plant 
carrying all three of the bacterial poly(3-hydroxybutyric acid) biosynthesis 
genes was selected. Mature leaves of some of the triple-gene-transgenic 
plants produced more than 1 mg of poly(3-hydroxybutyric acid) per gram 
(fresh weight) of leaf. Unfortunately, A. thaliana plants that produced very 
high levels of poly(3-hydroxybutyric acid) were severely stunted. 
Nevertheless, this work is a first step in the development of plants that 
produce large amounts of poly(3-hydroxybutyric acid). However, to realize 
the commercial potential of this system, these polymers will have to be 
produced in plants other than A. thaliana so that there will be a much 
greater amount of plant biomass produced. 
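
The two-step crossing scheme described above explains why triple-gene plants had to be selected from among the progeny. The sketch below estimates the expected fraction of progeny that inherit every transgene, under assumptions that are not stated in the text: each transgene is a single-copy, hemizygous insertion in the parent that carries it, and the insertions are unlinked and segregate independently. The function name is hypothetical.

```python
from fractions import Fraction


def fraction_inheriting_all(n_transgenes):
    """Expected fraction of progeny that inherit all n transgenes from a cross,
    ASSUMING each transgene is hemizygous (one copy) in the parent carrying it
    and that the insertions are unlinked. Each transgene is then passed to a
    given offspring with probability 1/2, independently of the others."""
    return Fraction(1, 2) ** n_transgenes


if __name__ == "__main__":
    # First cross: one single-gene parent x another single-gene parent.
    print("both genes:", fraction_inheriting_all(2))
    # Second cross: double-gene plant (two transgenes) x third single-gene parent.
    print("all three genes:", fraction_inheriting_all(3))
```

Under these assumptions only about one in four progeny of the first cross, and one in eight of the second, would carry the full set of genes. Homozygous or linked insertions would change these fractions.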


Edible Vaccines 

Although considerable progress has been made in recent years in the development
of new vaccines, in many countries, either the vaccine itself is too
expensive to be used on a large scale or there is a lack of physical infrastructure
(e.g., roads and refrigeration) that makes it impossible to disseminate
the vaccine. Commercial vaccines are expensive to produce and package 
and require trained personnel to administer injections. Clearly, it would be 
advantageous if vaccines could be delivered inexpensively on a broad scale 
in an edible form, e.g., as part of a fruit or vegetable. When vaccines are 
taken orally, they can directly stimulate the immune system (Fig. 20.22). An 
edible vaccine, in contrast to traditional vaccines, would not require elaborate
production facilities, purification, sterilization, packaging, or specialized
delivery systems. Moreover, unlike many currently utilized recombinant
protein expression systems, plants glycosylate proteins, a factor that may
contribute to the immunogenicity and stability of a target protein. Much of
the work on edible vaccines that has been reported so far utilizes potatoes
as the delivery vehicle. Potatoes were originally chosen for this work
because they were easy to manipulate. However, potatoes were never
intended to be the vaccine delivery plant; they require cooking to make
them palatable, and cooking destroys (inactivates) most protein antigens.
Plants that are being considered for the delivery of edible vaccines include
bananas (although banana trees require several years to mature), tomatoes
(although tomatoes spoil readily), lettuce, carrots, peanuts, and corn
(mainly for "vaccinating" animals).

FIGURE 20.22 Schematic representation of how an edible vaccine generates an
immune response against an antigen from an infectious agent. The ingested antigen,
which is expressed as part of a plant, binds to and is taken up by M cells present in
the lining of the intestine and is then passed to other cells in the immune system,
including macrophages and B cells. The macrophages display portions of the
antigen to the helper T cells, which in turn respond by secreting small molecules
that activate B cells to synthesize and release antibodies that can neutralize the
antigen.

Cholera is an infectious diarrheal disease caused by the enterotoxin 
produced by the gram-negative bacterium Vibrio cholerae (see chapter 12).
Globally, there are more than 5 million cases and 200,000 deaths from
cholera each year. To test whether it might be possible to develop an edible
vaccine against V. cholerae, potato plants were transformed, using A. tumefaciens,
with the cholera toxin subunit B gene. Cholera toxin subunit B
binds to an intestinal receptor; subunit A contains the toxin activity. One
gram of transgenic potato produced approximately 30 µg of subunit B.
After the transgenic potatoes were cooked in boiling water until they were 

FIGURE 20.23 Schematic representation of the association of two components of
an engineered vaccine. The cholera toxin B subunit is synthesized as a
fusion protein to the rotavirus peptide, the cholera toxin A2 subunit is fused to
the enterotoxigenic E. coli fimbrial colonization factor, and the cholera toxin B
and A2 peptides are noncovalently attached to one another. Both fusion
proteins are made in transgenic potatoes and combine to form a cholera
toxin-like protein, as shown. (Figure labels: cholera toxin B subunit, cholera
toxin A2 peptide, rotavirus peptide, E. coli fimbrial colonization factor.)

soft enough to be edible by humans, approximately 50% of the subunit B 
protein remained undenatured. The cooked potatoes were fed to mice once 
a week for 4 weeks before the mice were tested for antibodies against the 
subunit B protein and for resistance to V. cholerae-caused diarrhea. These
tests indicated that the mice had acquired a significant level of protection 
against V. cholerae. Moreover, although mucosal antibody titers declined 
gradually after the last immunization, they were rapidly restored after an 
oral boost (an additional feeding) of transgenic potato. In an interesting 
variation of the strategy outlined above, the cholera toxin subunit B and A2 
genes were each fused to different antigen genes and then used to generate 
transgenic potato plants. To create these two fusion proteins, a 22-amino- 
acid epitope from murine (mouse) rotavirus enterotoxin NSP4 was fused to 
the C-terminal end of the cholera toxin subunit B protein, and the enterotoxigenic
E. coli fimbrial colonization factor CFA/I was fused to the
N-terminal end of the cholera toxin subunit A2 protein (Fig. 20.23).
Normally, the A2 peptide links the A1 peptide, which has the toxic activity,
with the subunit B peptide, which has the binding activity. Transgenic
potatoes that expressed the two fusion proteins were fed to mice, which
generated antibodies against cholera toxin subunit B protein, murine rotavirus
enterotoxin NSP4, and E. coli fimbrial colonization factor CFA/I and
were protected against rotavirus-caused diarrhea. This approach holds 
great promise for the development of inexpensive and readily available 
vaccines for a wide range of diseases. 
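
The practical consequence of cooking losses can be made concrete with a small calculation. This sketch takes the expression level of subunit B as about 30 µg per gram of potato (the unit is read here as micrograms, which is an assumption about the original figure) and the fraction surviving boiling as about 50%; the serving sizes are purely hypothetical.

```python
def intact_antigen_ug(potato_g, expression_ug_per_g=30.0, fraction_surviving_cooking=0.5):
    """Micrograms of undenatured cholera toxin subunit B delivered by a
    serving of cooked transgenic potato.

    potato_g: mass of potato eaten, in grams (hypothetical serving size).
    expression_ug_per_g: subunit B per gram of raw potato (about 30, from the text).
    fraction_surviving_cooking: share left undenatured after boiling (about 0.5).
    """
    return potato_g * expression_ug_per_g * fraction_surviving_cooking


if __name__ == "__main__":
    for grams in (1, 5, 100):  # hypothetical serving sizes
        print(f"{grams} g cooked potato -> {intact_antigen_ug(grams):g} µg intact subunit B")
```

On these numbers each gram of cooked potato delivers roughly 15 µg of intact antigen, so a host plant that can be eaten raw would, in principle, deliver about twice as much per gram at the same expression level.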

It has been estimated that Shiga toxin-producing strains of E. coli cause 
approximately 100,000 cases of hemorrhagic colitis a year. About 6% of 
those infections produce severe complications, including kidney failure. 
Similar to cholera toxin, the Shiga toxin contains one A subunit, which
carries the catalytic, or toxin, activity, per five B subunits, which act
(together) to bind to cellular surface receptors. To develop an oral vaccine 
against type 2 Shiga toxin (type 2 is responsible for the most severe disease 
in humans), the genes for a genetically inactivated version of the Shiga 
toxin A and B peptides were both cloned and expressed in tobacco cells 
(Fig. 20.24). To test the ability of transformed tobacco plants that synthesized
the modified Shiga toxin to protect mice against the toxin, scientists
infected mice with Shiga toxin-producing strains of E. coli. Before the introduction
of the Shiga toxin-producing bacteria, some of the mice were fed
leaves from transgenic tobacco plants expressing the inactivated Shiga
toxin once a week for 4 weeks, while other mice were untreated. One week
after the introduction of the toxic E. coli strain, all of the mice that were not
fed the antigen-producing tobacco had died. By contrast, 2 weeks after
treatment with the toxic E. coli strain, all of the orally vaccinated mice were
still alive. This experiment serves as a proof of the concept that oral administration
of the inactivated Shiga toxin is a highly effective means of protecting
animals against Shiga toxin-producing E. coli. Of course, for a
human oral vaccine, a more suitable host plant, such as tomato or banana, 
would be desirable. 


Plant Yield 

In nearly every instance where plants are utilized, it is desirable to optimize 
the yield of the plant or its components. Theoretically, increases in plant 
yield may be achieved in a number of ways. For example, preliminary 
studies directed toward introducing the C4 pathway of photosynthesis in

FIGURE 20.24 Schematic representation of Shiga toxin A and B subunits encoded in a
bacterial operon (stxA and stxB genes, respectively) under the control of a single 
promoter and transcription terminator. These two genes were subsequently isolated 
and modified, first to inactivate the A subunit and then to eliminate sequences on 
both genes that might adversely affect their transcription in plant cells. The modified 
genes were inserted between the left and right borders (LB and RB) of the T-DNA of 
a binary Ti plasmid-based vector, each under the control of a separate cauliflower 
mosaic virus 35S promoter. The T-DNA also contained a gene encoding kanamycin 
resistance under the control of its own promoter. The transcription terminator 
sequences of the three genes contained within the T-DNA are not shown for the sake 
of simplicity. The final construct was used to transform tobacco plants. 


plants that normally use the C3 pathway have been reported; plants have
been engineered to take up iron from the soil more efficiently; the supply 
of oxygen to plant cells has been enhanced, thereby increasing the rate of 
plant growth; and the lignin content of some trees has been decreased, 
making it easier to isolate the cellulose. The work of increasing plant yields 
is still in its infancy. Nevertheless, based on what has been achieved to date, 
the various strategies are likely to be fruitful. 

Increasing Iron Content 

Plant growth is often limited by the availability of iron. Despite the abundance
of this element on the earth's surface, plants have difficulty obtaining
enough iron to support their growth, because the iron in soil, especially in 
alkaline soil, is largely present as insoluble ferric hydroxides, which cannot 
be readily transported into cells. To solve this problem, bacteria, fungi, and 
plants secrete small, specialized iron-binding molecules, called siderophores,
that scavenge iron. Once bound, the now soluble iron-siderophore
complex is taken up by specific receptors on the surfaces of these organisms;
after reduction to the ferrous state, the iron is released from the
siderophore. 

FIGURE 20.25 Overview of the biosynthesis of the mugineic acid family of
phytosiderophores.


Siderophores are low-molecular-weight molecules, usually less than 1 
kilodalton, with three functional, or iron-binding, groups connected by a 
flexible backbone. Each functional group presents two atoms of either 
oxygen or (less commonly) nitrogen that bind to iron. In chemical terms, 
the functional groups are bidentate, and trivalent ferric iron can accommodate
three of these groups to form a six-coordinate complex (see Fig. 15.8).
The functional groups of microbial siderophores are usually either hydroxamates
or catecholates. Other functional groups include carboxylate moieties
(such as citrate) and ethylenediamine. Plant siderophores, on the
other hand, are linear hydroxy- and amino-substituted iminocarboxylic 
acids, such as mugineic acid and avenic acid. 

Since rice plants secrete only very small amounts of mugineic acid and 
are highly susceptible to growth inhibition from iron deficiency, it was reasoned
that increasing the amount of mugineic acid should enable the plant
to take up more iron and hence increase the yield of the plant. Mugineic acid
is synthesized in several steps from the amino acid L-methionine (Fig.
20.25). To increase the amount of mugineic acid, rice plants were transformed
(using Agrobacterium) with an 11-kb fragment of barley genomic
DNA containing two naat genes, naat-A and naat-B, encoding the subunits 
of the enzyme nicotianamine aminotransferase. Both of these genes were 
under the transcriptional control of their native promoters. In the resulting 
transgenic rice plants, the pattern of expression of the two proteins was the 
same as in barley: expression in roots was very low in the presence of high 
levels of iron and was high in the presence of low levels of iron. After 16 
weeks of growth in an alkaline soil, the shoot dry weight and grain yield of 
the transgenic plants were more than four times those of the nontransgenic 
plants, an enormous difference in yield. Hopefully, this spectacular gain will 
be realized in the field as well as under controlled laboratory conditions. 

Altering Lignin Content 

After cellulose, lignin is the second most abundant organic compound on 
Earth. Depending on the species of tree, it accounts for approximately 15 to 
35% of the dry weight of wood. While lignin is important in the mechanical 
support of trees and in their defense against pathogens, it is a major impediment
in obtaining the cellulose that is needed by the pulp and paper
industry, because releasing cellulose requires harsh physical and chemical treatment.
In the United States, over 80 million tons of wood pulp is produced
annually. This process, which removes nearly 30 million tons of lignin from 
wood, consumes enormous amounts of energy and chemicals. In addition, 
high levels of lignin decrease the nutritional value of forage crops that are 
used for cattle feed. 

Lignin is synthesized by the oxidative polymerization of one of three 
hydroxycinnamyl alcohols: p-coumaryl, coniferyl, or sinapyl alcohol. In 
one approach to reduce the level of lignin produced by quaking aspen trees 
(Populus tremuloides) and to make it easier to harvest the cellulose, one of 
the steps in the biosynthesis of lignin was altered. The enzyme 
4-coumarate:coenzyme A ligase catalyzes the conversion of 4-coumarate to
4-coumaroyl-coenzyme A. The latter compound is a precursor of both flavonoids
and lignin (Fig. 20.26). Moreover, there are two isoforms of the
enzyme 4-coumarate:coenzyme A ligase. One isoform is expressed in the
epidermis of stems and leaves and is probably involved in flavonoid biosynthesis.
The other isoform is expressed in differentiating xylem and

FIGURE 20.26 Role of 4-coumarate:coenzyme A (CoA) ligase in lignin biosynthesis.

presumably functions mainly in lignin biosynthesis. The gene for the isoform
expressed in xylem was isolated, and an antisense version of the gene
was placed under the transcriptional control of the 35S promoter and introduced
into aspen by A. tumefaciens-mediated transformation. Transformed
aspen trees exhibited up to a 45% decrease in lignin content and, at the
same time, as much as a 15% increase in cellulose. In addition, the transgenic
trees were about 25% larger and had thicker stems and larger leaves
than the nontransgenic trees. The altered lignin and cellulose content of the 
transgenic aspen should make it easier to extract cellulose during pulp and 
paper manufacture. 

Recently, a considerable amount of effort has been directed toward 
finding ways to efficiently utilize plants to produce renewable biofuels. 
Some of this research has focused on converting materials such as corn
stover, wheat straw, grasses, and wood by-products to glucose and then to
ethanol. However, with all of these materials, it is necessary to pretreat
the substrate under harsh conditions to remove the lignin and make the cellulose
and hemicellulose more accessible to hydrolytic enzymes. These
pretreatment steps are generally costly in terms of materials and energy, and
often the harsh conditions employed make it much more difficult to enzymatically
digest the cellulose. However, by reducing the lignin content of
plant tissues, it may be possible to employ mild pretreatment or even no
pretreatment prior to enzymatic digestion (Fig. 20.27). In fact, by constructing
transgenic alfalfa plants that contained an antisense version of the
gene for the enzyme shikimate hydroxycinnamoyl transferase, which catalyzes
one of the steps in the biosynthesis of lignin, it was possible to reduce
the lignin content of the plant by about 50 to 70%. Interestingly, the transgenic
plants that had the lowest lignin content also had the highest carbohydrate
levels, suggesting that the plant has somehow compensated for the
reduction in lignin. Moreover, when the lignin levels are low, the cell wall 
carbohydrates are more accessible to cellulose- and hemicellulose-degrading 

FIGURE 20.27 Schematic representation of native (A) and transgenic (B) plant cell
walls treated to partially break down their structure and make the hemicellulose 
and cellulose components more accessible to enzymatic digestion. Lignin is shown 
in black, cellulose in blue, and hemicellulose in red. 


enzymes. Importantly, since the biochemical steps that lead to the synthesis 
of lignin are highly conserved across the plant kingdom, it may be possible 
to apply this approach to a variety of plants so that, for example, corn may 
be engineered to continue to be a food source for both animals and humans 
while the parts of the corn plant that are not consumed may be converted
into biofuels. 

Erect Leaves 

Conventional varieties of wheat and rice allocate a considerable amount of 
their resources to the growth of vegetative tissues. On the other hand, 
semidwarf varieties, developed by the conventional breeding strategies 
that were part of the "green revolution," allocate more of their resources 
to grain. In addition, semidwarf varieties are more resistant to damage 
from wind and rain and produce much less residual (waste) biomass. 
However, to feed an ever-increasing world population, it is necessary to 
engineer plants to produce even higher grain yields. One way to do this is 
to develop dwarf strains of rice and wheat that accumulate even less vegetative
material and put more resources into increasing the grain yield.
Scientists have approached this problem by using genetic engineering to 
manipulate the levels of some plant hormones, most notably gibberellins 
and brassinosteroids. 

Brassinosteroids include more than 40 different plant polyhydroxylated
sterol derivatives. They affect a variety of plant processes, typically at
low concentrations, including cell elongation, vascular development, and 
stress tolerance. By genetically engineering rice plants to have a decreased 
level of brassinosteroids, scientists were able to increase the grain yield 
when the crop was planted at a high density. 

The rate-limiting step in the synthesis of brassinosteroids is the 
hydroxylation of carbon 22 (Fig. 20.28). In rice, the hydroxylation of C-22 is 
controlled by two differentially regulated genes. One gene, called 
OsDWARF4L1, contributes primarily to brassinosteroid synthesis in shoots.
The other gene, OsDWARF4, primarily controls brassinosteroid synthesis 
for leaf inclination (Fig. 20.29). Initially, a collection of transposon mutants 
of rice was screened for an "erect-leaf" phenotype, i.e., plants in which the 
leaves did not lean over but stood more or less straight up. One of the 
mutants that was selected was characterized in detail and found to have a 
mutation in the OsDWARF4 gene. Mutation of the other C-22 hydroxylation
gene or mutation of both genes yielded smaller plants that did not
exhibit the erect-leaf phenotype. Since the mutation that conferred the 
erect-leaf phenotype on plants was created by transposon mutagenesis, it 
was relatively easy to isolate, and subsequently characterize, the transposon-labeled
gene. Importantly, when plants that showed the erect-leaf
phenotype were planted at a high density, they produced about 20% more
rice than wild-type plants planted at the same density. This increase was
attributed to the reduced shading (and hence the greater amount of photosynthesis)
of the lower leaves in the mutant (erect-leaf) plants. Having
identified the gene whose activity is critical to increasing rice yield, it
should now be possible to attenuate the activity of that gene using either 
antisense constructs or RNA interference. Moreover, it is expected that any 
approach that works in rice should be equally effective in other plants. 
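
The genotype-to-phenotype relationships described above (and summarized in Fig. 20.29) can be written as a small lookup table. This is only an illustrative sketch: the sizes are the approximate percentages of wild-type size given in the figure, and the 90% size cutoff used to pick candidates for dense planting is an arbitrary assumption made for this example.

```python
from typing import NamedTuple


class Phenotype(NamedTuple):
    relative_size_percent: int  # approximate size relative to wild type
    erect_leaves: bool


# Keys are (OsDWARF4 functional?, OsDWARF4L1 functional?).
PHENOTYPES = {
    (True, True): Phenotype(100, False),   # wild-type rice
    (False, True): Phenotype(95, True),    # mutant rice 1
    (True, False): Phenotype(60, False),   # mutant rice 2
    (False, False): Phenotype(10, False),  # double mutant (1-2)
}


def dense_planting_candidates(min_size_percent=90):
    """Genotypes with erect leaves that stay close to wild-type size.
    The size cutoff is a hypothetical threshold, not a value from the text."""
    return [genotype for genotype, phenotype in PHENOTYPES.items()
            if phenotype.erect_leaves
            and phenotype.relative_size_percent >= min_size_percent]


if __name__ == "__main__":
    for osdwarf4_ok, osdwarf4l1_ok in dense_planting_candidates():
        print(f"OsDWARF4 functional: {osdwarf4_ok}, OsDWARF4L1 functional: {osdwarf4l1_ok}")
```

Only the loss of OsDWARF4 alone satisfies both conditions, which is why that gene is the target proposed for attenuation by antisense constructs or RNA interference.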

Increasing Oxygen Content 

Oxygen is an essential substrate for plant respiratory metabolism, and in 
fact, the amount of available oxygen may limit a number of plant biochemical
reactions. Since it is impractical to increase the external oxygen supply


FIGURE 20.28 Chemical structure of brassinolide, with an arrow pointing to the 
position of C-22. 




Strain            Genotype                                   Phenotype
Wild-type rice    OsDWARF4, OsDWARF4L1                       100% size; leaves not erect
Mutant rice 1     OsDWARF4 (mutated), OsDWARF4L1             95% size; leaves erect
Mutant rice 2     OsDWARF4, OsDWARF4L1 (mutated)             ~60% size; leaves not erect
Mutant rice 1-2   OsDWARF4 (mutated), OsDWARF4L1 (mutated)   ~10% size; leaves not erect

FIGURE 20.29 Wild-type and mutant strains of rice with altered DWARF4 genes.

to the plant, one way to increase the oxygen concentration inside a plant is 
to provide the plant with a protein that can sequester oxygen and provide it 
where it is needed. 

The gram-negative bacterium Vitreoscilla produces a dimeric hemoglobin
that binds oxygen tightly and allows the bacterium to proliferate
under oxygen-limited conditions. The gene encoding this hemoglobin has 
been isolated and expressed in a variety of bacteria, increasing the growth 
rate, the final cell density, and the yield of cloned foreign genes. When the 
Vitreoscilla hemoglobin gene under the transcriptional control of the 35S promoter
was introduced into tobacco by A. tumefaciens-mediated gene transfer, transgenic
plants produced 80 to 100% more dry weight than nontransgenic
plants, seed germination time was reduced from 6 to 8 days to 3 to 4 days, 
and transgenic plants contained approximately 35% more chlorophyll and 
34% more nicotine than nontransgenic plants. It is thought that the 
Vitreoscilla hemoglobin increases the availability of oxygen and/or energy in 
the cell. For example, the synthesis of 1 mol of chlorophyll from glutamate 
requires 4 mol of oxygen and 2 mol of ATP. Similarly, the last step in the 
synthesis of nicotine, the conversion of nicotinic acid and N-methyl-Δ1-pyrrolinium
salt to nicotine, is catalyzed by the oxygen-dependent enzyme
nicotine synthase (Fig. 20.30). While this sort of genetic manipulation raises 
more questions than it answers, the possibility of increasing plant yield by 
manipulating the oxygen concentration is quite exciting. 
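
The oxygen requirement quoted above for chlorophyll synthesis can be turned into a simple estimate of how much extra oxygen and ATP the additional chlorophyll represents. The sketch uses only the stoichiometry given in the text (4 mol of oxygen and 2 mol of ATP per mole of chlorophyll); the baseline of 1 mol of chlorophyll is a hypothetical reference amount chosen for illustration.

```python
# Stoichiometry quoted in the text, per mole of chlorophyll made from glutamate.
O2_PER_MOL_CHLOROPHYLL = 4   # mol of oxygen
ATP_PER_MOL_CHLOROPHYLL = 2  # mol of ATP


def chlorophyll_demand(chlorophyll_mol):
    """Return (mol O2, mol ATP) needed to synthesize the given moles of chlorophyll."""
    return (chlorophyll_mol * O2_PER_MOL_CHLOROPHYLL,
            chlorophyll_mol * ATP_PER_MOL_CHLOROPHYLL)


if __name__ == "__main__":
    baseline = 1.0                # hypothetical reference amount (mol)
    transgenic = baseline * 1.35  # about 35% more chlorophyll, as reported
    for label, amount in (("nontransgenic", baseline), ("transgenic", transgenic)):
        oxygen, atp = chlorophyll_demand(amount)
        print(f"{label}: {oxygen:.1f} mol O2 and {atp:.1f} mol ATP")
```

Because the demand scales linearly, a 35% increase in chlorophyll implies a 35% increase in the oxygen and ATP consumed by this pathway, which is consistent with the suggestion that the hemoglobin improves oxygen and/or energy availability.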


Phytoremediation 

Phytoremediation is defined as the use of plants to remove, destroy, or 
sequester hazardous substances from the environment. Phytoremediation 
of metals and other inorganic compounds may take one of several forms: 

FIGURE 20.30 The last step in the biosynthesis of nicotine. 

phytoextraction, the absorption and concentration of metals from the soil
into the plant; rhizofiltration, the use of plant roots to remove metals from 
effluents; phytostabilization, the use of plants to reduce the spread of 
metals in the environment; or phytovolatilization, the uptake and release 
into the atmosphere of volatile materials, such as mercury- or arsenic-containing
compounds. A number of plants that can naturally accumulate
large amounts of metal have been identified. These plants are called hyperaccumulators.
Unfortunately, in the presence of very high concentrations of
metals, even hyperaccumulating plants attain only a small size. That is, 
high concentrations of metals are inhibitory to the growth of plants, even 
those plants that are capable of hyperaccumulating metals. Depending 
upon the amount of metal at a particular site and the type of soil at that site, 
it could take many years to completely remove the metal from the soil and 
remediate the site, even with hyperaccumulating plants. Since this is too 
slow for practical application, scientists have undertaken to engineer plants 
for more efficient metal phytoremediation. 

In general, plants sequester toxic pollutants in places where the toxicants
can do the least harm to plant cellular processes. Thus, pollutants are
typically accumulated in vacuoles or cell walls. The uptake and accumulation
in leaves of inorganic contaminants without toxic effects are desirable
properties for phytoextraction. To this end, plants can be engineered to
have higher levels of transporters involved in the uptake of inorganic pollutants
from the xylem into the leaf symplast and from the cytosol into
vacuoles. 

To engineer plants that can accumulate greater amounts of lead and
cadmium, toxic metals that contaminate a wide range of environments, the
yeast gene YCF1 was used to transform A. thaliana plants. The yeast YCF1
protein is a member of the ABC transporter family of proteins and, when it
is expressed in A. thaliana, detoxifies metals taken up by the plant by
transporting them into plant cell vacuoles (Fig. 20.31). Although the YCF1
protein is expressed at a relatively low level in transgenic plants (it
cannot be detected by immunological assays of plant cellular extracts), it
nevertheless effectively sequesters metals into the vacuoles. Transgenic
plants grown in the presence of high levels of either lead or cadmium have
nearly four times as much metal in their vacuoles and about twice as much
metal overall throughout the plant as nontransformed plants. This model
system is an important first step in developing transgenic plants, such as
poplar trees, that can remove metals from contaminated soils.

FIGURE 20.32 Detoxification of organomercural compounds by bacterial
organomercural lyase (MerB) and mercuric ion reductase (MerA).

Organic forms of mercury (Hg), especially methyl mercury, are highly
toxic to both plants and animals. In plants, these compounds inhibit
electron transport and photosynthesis; both are chloroplast functions. At
the present time, there are no simple and inexpensive procedures for
removing mercury from the environment. However, it is possible to engineer
plants with bacterial genes that encode enzymes that can detoxify organic
forms of mercury. Mercury-resistant bacteria detoxify organomercurals by
producing two enzymes: organomercural lyase, which catalyzes the conversion
of the organomercural to the less toxic inorganic species, Hg(II), and
mercuric ion reductase, which catalyzes the reduction of Hg(II) to the
volatile and less reactive elemental form, Hg(0) (Fig. 20.32). The enzymes
are encoded by the merB and merA genes, respectively, of the plasmid-borne
mercury resistance (mer) operon. Expressing the mer genes in plants by
transformation of the nuclear genome provides some protection against the
toxic effects of mercury. Alternatively, the operon containing both genes
may be expressed in tobacco chloroplasts (in a single transformation
event), obviating concerns about position effects, codon usage, or
transmission of the foreign genes to other plants via pollen. Moreover,
chloroplast expression of the mer genes leads to a high level of protein
expression and partially protects the plant from inhibition of electron
transport and photosynthesis (Fig. 20.33). Now that the efficacy of this
approach has been demonstrated with tobacco, it may be useful to fine-tune
the system to increase the ability of plants to detoxify mercury. Also, it
may be advantageous to introduce the mer operon into other plants with more
extensive root systems that are better suited to take up mercury from the
soil.

FIGURE 20.33 Transgenic tobacco plants engineered to express mer genes in
their chloroplasts produce more biomass than nontransgenic tobacco plants
in the presence of increasing amounts of mercury.
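
The two-step detoxification route described above can be written out as a
small state model. The following Python sketch is illustrative only: the
names MercurySpecies, mer_b, mer_a, and detoxify are hypothetical, and the
model captures nothing beyond the order of the two reactions (organic
mercury to Hg(II) by the lyase, then Hg(II) to Hg(0) by the reductase).

```python
from enum import Enum


class MercurySpecies(Enum):
    ORGANIC = "organic mercury (e.g., methyl mercury)"
    HG_II = "Hg(II), inorganic ion"
    HG_0 = "Hg(0), volatile elemental mercury"


def mer_b(species):
    """Lyase step (MerB): organic mercury -> Hg(II); other forms unchanged."""
    return MercurySpecies.HG_II if species is MercurySpecies.ORGANIC else species


def mer_a(species):
    """Reductase step (MerA): Hg(II) -> Hg(0); other forms unchanged."""
    return MercurySpecies.HG_0 if species is MercurySpecies.HG_II else species


def detoxify(species, has_mer_b=True, has_mer_a=True):
    """Apply whichever enzymes the (hypothetical) plant expresses, in order."""
    if has_mer_b:
        species = mer_b(species)
    if has_mer_a:
        species = mer_a(species)
    return species


if __name__ == "__main__":
    start = MercurySpecies.ORGANIC
    print("both enzymes:", detoxify(start).value)
    print("MerB only:   ", detoxify(start, has_mer_a=False).value)
    print("MerA only:   ", detoxify(start, has_mer_b=False).value)
```

In this toy model, a plant with only the reductase cannot act on organic
mercury at all, which is one way to see why both genes of the operon are
introduced together.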

Phytoremediation of organic compounds may occur by phytostabilization;
by phytostimulation, the stimulation of microbial biodegradation
around the roots of plants; or by phytotransformation, the absorption and
degradation of organic contaminants by the plant. A number of different
types of plants are effective at stimulating the degradation of organic
molecules in the rhizosphere (i.e., around the plant roots). Typically,
these plants, including many common grasses, as well as crop species, have
extensive and fibrous roots, which form an extended rhizosphere. In
addition, several varieties of trees can take up and degrade some organic
contaminants. Plants with phytotransformation activity may contain
nitroreductases, which are useful for degrading TNT (trinitrotoluene) and
other nitroaromatics; dehalogenases for the degradation of chlorinated
solvents and pesticides; and laccases that can degrade anilines, such as
triaminotoluene. However, there is less incentive to engineer plants to be
more efficient degraders of organic compounds, since (1) many plants can
already do this effectively and (2) selecting and/or engineering soil
microbes that live in the vicinity of plant roots may provide a simpler
means to the same end.


SUMMARY 


The expression of foreign genes in plants makes it possible
to produce a wide range of new plant varieties. Plants 
with new flower colors have been developed; the nutritional 
content of crops has been enhanced; discoloration of potatoes 
has been prevented by genetic manipulation; the sweetness of 
some plants has been augmented; plants have been developed 
to act as factories for the large-scale production of important 
foreign proteins, such as antibodies and therapeutics; and 
plant yield has been increased by increasing the iron and 
oxygen content, by modulating the lignin content, and by 
modifying the response of the plant to brassinosteroids. Also, 
plants have been utilized as components of phytoremediation 
protocols designed to remove contaminating metals and 
organic compounds from the environment. 


Plant nutritional content may be improved in a variety of 
ways. The amino acid content (specifically methionine and 
lysine) can be increased, the lipid composition can be modified
to suit the intended end use of the oil, pathways for the
synthesis of vitamin E and the precursor to vitamin A have 
been engineered, and plants with increased levels of available 
iron have been created. 

As bioreactors, plants can produce, on a laboratory scale, 
monoclonal antibodies and antibody fragments; the polymer 
polyhydroxybutyrate, which can be used to make a biodegradable
plastic-like material; and a number of potential
therapeutic agents. Finally, transgenic plants have also been 
used as edible vaccines, an approach that could result in a 
wide range of new and inexpensive vaccines. 

REVIEW QUESTIONS

1. How can plants be genetically manipulated to produce
flowers with unusual colors?

2. How can soybeans be genetically manipulated to increase
their lysine content?

3. How can vitamins be overproduced in rice plants?

4. How would you engineer soybean plants to overproduce
long-chain omega-3 and omega-6 fatty acids?

5. What is the effect of increasing the level of oxidized
glutathione within a plant? How would you genetically manipulate
a plant to do this?

6. How can the level of bioavailable iron in plants be
increased?

7. How would you increase the amount of amylose compared
to amylopectin in potato starch? How would you increase the
amount of amylopectin compared to amylose? What is the
advantage of performing these manipulations?

8. How does an antigen expressed in a transgenic plant act as
an edible vaccine?

9. How would you engineer an edible vaccine directed
against V. cholerae-caused diarrhea?

10. How can a plant's oxygen levels be manipulated? How
does manipulation of a plant's oxygen levels affect the yield of
that plant?

11. How would you engineer rice plants to overproduce
tetrahydrofolate? Why would you do this?

12. Why is it necessary to genetically engineer soybean plants
to have a small amount of phytate in their seeds when
low-phytate mutants may be selected following conventional
mutagenesis?

13. Briefly describe a vector system that may be used to
engineer plants to produce large amounts of full-size IgG
molecules.

14. Why are plants an attractive host system, compared to
bacteria and animal cells in culture, for the production of
human therapeutic proteins?

15. What is phytate? How can the levels of phytate in seeds be
modulated? Why would you want to do this?

16. Describe a strategy for developing a plant vaccine against
type 2 Shiga toxin.

17. How can the lignin content of trees be decreased? What is
the benefit of this type of genetic manipulation?

18. How can manipulation of the level of brassinosteroids be
used to increase plant yield?

19. How can plants be genetically modified to increase their
effectiveness in the phytoremediation of certain metals?

When traditional breeding methods are used, many generations
of selective matings are required to improve livestock and other
domesticated animals genetically for traits such as milk yield,
wool characteristics, rate of weight gain, and egg-laying frequency. At each
successive generation, animals with superior performance characteristics
are used as breeding stock. Eventually, high-production animals are
developed as more or less pure breeding lines. This combination of mating
and selection, although time-consuming and costly, has been exceptionally
successful. Today, almost all aspects of the biological basis of livestock
production can be attributed to this process. However, once an effective
genetic line has been established, it becomes difficult to introduce new
genetic traits by selective-breeding methods. For example, a strain with a
newly discovered, valuable gene may also carry deleterious genes that, after
crossing, would diminish the existing genetically determined production
levels. Thus, a completely new program of multigenerational crosses with
rigorous selection procedures has to be initiated to ensure that a new
breeding line retains both its original attributes and the new trait.

Until recently, selective breeding was the only way to enhance the 
genetic features of domesticated animals. However, the combination of the 
successful transfer of genes into mammalian cells and the possibility of 
creating genetically identical animals by transplanting nuclei from somatic 
cells into enucleated eggs (nuclear transfer, or nuclear cloning) led 
researchers to consider putting single functional genes or gene clusters into 
the chromosomal DNA of higher organisms. Conceptually, the strategy 
used to achieve this end is simple. (1) A cloned gene is injected into the 
nucleus of a fertilized egg. (2) The inoculated fertilized eggs are implanted 
into a receptive female because successful completion of mammalian 
embryonic development is not possible outside of a female. (3) Some of the 
offspring derived from the implanted eggs carry the cloned gene in all of 
their cells. (4) Animals with the cloned gene integrated in their germ line 
cells are bred to establish new genetic lines. 

TABLE 21.1 Protein compositions of milk from cattle and sheep

                          Concentration (g/liter) in milk from:
Protein                   Cattle        Sheep

Caseins
  αs1-Casein              10.0          12.0
  αs2-Casein              3.4           3.8
  κ-Casein                3.9           4.6
  β-Casein                10.0          16.0

Major whey proteins
  α-Lactalbumin           1.0           0.8
  β-Lactoglobulin         3.0           2.8

Other proteins
  Serum albumin           0.4           Unknown
  Lysozyme                Trace         Unknown
  Lactoferrin             0.1           Unknown
  Immunoglobulins         0.7           Unknown


This approach has many practical applications. If, for example, the 
product of the injected gene stimulates growth, animals that acquire this 
gene should grow faster and require less feed. An enhancement of feed 
efficiency by a few percent would have a profound impact on lowering the 
cost of production of either beef or pork. 

During the 1980s, with considerable effort, the idea of genetically
manipulating animals by introducing genes into fertilized eggs was
converted into reality. As with many new scientific enterprises, a set of
terms was created to make communication easier. For example, an animal whose
genetic composition has been altered by the addition of foreign
(exogenous) DNA is said to be transgenic. The DNA that is introduced is
called a transgene, and the overall process is called transgenic
technology, or transgenesis.

The genetic improvement of animals by the introduction of relevant
transgenes is only slowly being realized. In the meantime, however,
transgenesis has become a powerful technique for studying fundamental
problems of mammalian gene expression and development, for establishing
animal model systems for studying human diseases, for producing foreign
proteins in bird eggs, and for using the mammary gland to produce
pharmaceutically important proteins in milk. With this last application in
mind, the term "pharming" was coined to convey the idea that milk from
transgenic farm ("pharm") animals can be a source of authentic human
protein drugs or pharmaceuticals. There are a number of reasons why the
mammary gland should be used in this way. Milk is a renewable, secreted body
fluid that is produced in substantial quantities and can be collected
frequently without harm to the animal. A novel drug protein that is confined
to the mammary gland and secreted into milk should have no side effects
on the normal physiological processes of the transgenic animal and should
undergo posttranslational modifications that at least closely match those in
humans. Finally, purification of a protein from milk, which contains only a
small number of different proteins (Table 21.1), should be relatively
straightforward.
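
To make the point about milk's simple protein composition concrete, the
cattle column of Table 21.1 can be summed by protein class. The Python
sketch below is only an illustration: the dictionary layout and the
function names are hypothetical, protein names are spelled out in ASCII,
and the "Trace" entry for lysozyme is entered as 0.0 purely so that the
values can be added.

```python
# Cattle column of Table 21.1, in g/liter.
# Assumption: "Trace" (lysozyme) is treated as 0.0 for the purpose of summing.
CATTLE_MILK_PROTEINS = {
    "caseins": {
        "alpha-s1-casein": 10.0,
        "alpha-s2-casein": 3.4,
        "kappa-casein": 3.9,
        "beta-casein": 10.0,
    },
    "major whey proteins": {
        "alpha-lactalbumin": 1.0,
        "beta-lactoglobulin": 3.0,
    },
    "other proteins": {
        "serum albumin": 0.4,
        "lysozyme": 0.0,
        "lactoferrin": 0.1,
        "immunoglobulins": 0.7,
    },
}


def class_totals(table):
    """Sum the concentrations (g/liter) within each protein class."""
    return {name: sum(members.values()) for name, members in table.items()}


def class_fractions(table):
    """Return each class's share of the total listed protein."""
    totals = class_totals(table)
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}


if __name__ == "__main__":
    for name, share in class_fractions(CATTLE_MILK_PROTEINS).items():
        print(f"{name}: {share:.0%}")
```

Under these assumptions, the four caseins account for roughly 84% of the
listed protein in cattle milk, so a recombinant protein secreted into milk
has to be separated from only a few abundant species.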

FIGURE 21.1 Establishing transgenic mice with retroviral vectors. Cleavage stage
embryos, usually at the eight-cell stage, are infected with a defective retrovirus 
carrying a transgene. Implanted females (foster mothers) give birth to transgenic 
pups. Matings are carried out to determine which pups have the transgene in their 
germ line cells. Transgenic lines can be established from these founder transgenic 
animals. 


Transgenic Mice: Methodology 

Transgenic technology has been developed and perfected in the laboratory
mouse. Since the early 1980s, hundreds of different genes have been
introduced into various mouse strains. These studies have contributed to an
understanding of gene regulation, tumor development, immunological
specificity, molecular genetics of development, and many other biological
processes of fundamental interest. Transgenic mice have also played a role
in examining the feasibility of the industrial production of human
therapeutic drugs by domesticated animals and in the creation of transgenic
strains that act as biomedical models for various human genetic diseases.
For transgenesis, DNA can be introduced into mice by (1) retroviral vectors
that infect the cells of an early-stage embryo prior to implantation into a
receptive female, (2) microinjection into the enlarged sperm nucleus (male
pronucleus) of a fertilized egg, or (3) introduction of genetically engineered
embryonic stem cells into an early-stage developing embryo before
implantation into a receptive female.

The Retroviral Vector Method 

Of the various gene transfer methods, the use of retroviral vectors (Fig. 21.1) 
has the advantage of being an effective means of integrating the transgene 
into the genome of a recipient cell. Retroviruses have RNA genomes that are 
used as templates for reverse transcriptase to synthesize a DNA copy that 
can be inserted into the host cell genome (see chapter 11). However, vectors 
derived from these viruses can transfer only small pieces (~8 kilobases [kb]) 
of DNA that, because of the size constraint, may lack essential adjacent 
sequences for regulating the expression of the transgene. 

There is a further major drawback to the use of retroviral vectors. 
Although these vectors are designed to be replication defective, the genome 
of the retroviral strain (helper virus) that is needed to create large quantities 
of the vector DNA can be integrated into the same nucleus as the transgene. 
Despite special precautions, helper strain retroviruses could be produced 
by the transgenic organism. Consequently, for applications in which either 
a commercial product is to be synthesized by the transgenic organism or 
the transgenic organism is to be used as food, it is absolutely necessary that 
there be no retroviral contamination. In addition, transgenes introduced on 
some retroviral vectors are silenced in mouse embryos. 

However, transgenes carried on vectors derived from lentiviruses (Fig.
21.2), a group of retroviruses that includes human immunodeficiency virus
and similar viruses from other animals, are not silenced in embryos.
Moreover, these vectors are capable of delivering large segments of DNA
into the host genome, are stable for relatively long periods, have low
immunogenicity, and can infect both dividing and nondividing cells. The
last characteristic is an advantage for the expression of transgenes in
neuronal, muscle, liver, and other nondividing cells. The lentivirus vector
system is similar to other retroviral vector systems and consists of a
transfer vector that carries the transgene and a packaging cell line that
provides viral proteins for packaging the viral particles (Fig. 11.16).
Lentiviral vectors have been used successfully to introduce a variety of
transgenes into embryonic cells or early embryos of mice, pigs, cattle, and
birds, and these genes are highly expressed either ubiquitously or in a
tissue-specific manner depending on the promoter used. Lentiviruses are
also often used as delivery vehicles to introduce a construct to decrease
gene expression in mice and other organisms by RNA interference (RNAi) (see
below).

FIGURE 21.2 A lentiviral transfer vector for introducing transgenes into animal cells.
The transgene with an appropriate promoter sequence (p) is inserted into the
lentiviral vector. The transfer vector is introduced into a packaging cell
line that produces the viral proteins required for production of the viral
RNA, including the transgene sequence, and for packaging the RNA into viral
particles that will be used to deliver the transgene into animal cells.
Long terminal repeats (LTR) at the 5' and 3' ends of the vector are
required for production of lentiviral RNA, and the packaging signal (Ψ) is
required for packaging the RNA into viral particles. Following infection of
animal cells with lentivirus, lentiviral RNA is reverse transcribed, and
the transgene is integrated into the animal cell genome via sequences in
the LTRs. A polypurine tract sequence (PPT) and a woodchuck
posttranscriptional regulatory element (WPRE) enhance the transduction of
host cells and increase transgene expression in the animal cells. A
regulatory element within the 3' LTR is deleted (indicated by a black dot)
to prevent the production of vector RNA from a promoter contained within
the LTR following introduction into host cells. Expression of the transgene
is not affected by the deletion because it is expressed from its own
promoter.

FIGURE 21.3 Establishing transgenic mice by DNA microinjection. Eggs are obtained
from donor females that have been induced to superovulate and then mated with
males. Purified samples of the transgene construct are microinjected into the male
pronucleus of a fertilized egg. Implanted females (foster mothers) give birth to
transgenic pups, from which transgenic lines can be established.

The DNA Microinjection Method 

Because of the disadvantages of the retroviral vector method,
microinjection of DNA is currently the preferred method for producing
transgenic mice. This procedure is performed in the following way (Fig.
21.3). (1) The number of available fertilized eggs that are to be
inoculated by microinjection is increased by stimulating donor females to
superovulate. Female mice are given an initial injection of pregnant mare's
serum and another injection, about 48 hours later, of human chorionic
gonadotropin. A superovulated mouse produces about 35 eggs instead of the
normal 5 to 10. (2) The superovulated females are mated so that the eggs
become fertilized; the females are then killed, and the fertilized eggs are
flushed from their oviducts. (3) Microinjection of the fertilized eggs
usually occurs immediately after their collection. The microinjected
transgene construct is often in a linear form and free of prokaryotic
vector DNA sequences.

In mammals, after entry of the sperm into the egg, the sperm nucleus
(male pronucleus) and the female nucleus remain separate entities. After
the female nucleus completes its meiotic division to become a female
pronucleus, nuclear fusion (karyogamy) occurs. The male pronucleus, which
tends to be larger than the female pronucleus, can be located by using a
dissecting microscope. The egg can then be maneuvered, oriented, and
held in place by micromanipulation while the DNA is microinjected. On a
good day, several hundred male pronuclei can be inoculated.

After inoculation, 25 to 40 eggs are implanted microsurgically into a 
foster mother that has been made pseudopregnant by being mated to a 
vasectomized male. In mice, copulation is the only known way to prepare 
the uterus for implantation. In this case, because the male mate lacks 
sperm, none of the eggs of the foster mother are fertilized. The foster 
mother will deliver pups from the inoculated eggs about 3 weeks after 
implantation. 

For identification of transgenic animals, DNA from a small piece of the 
tail can be assayed by either Southern blot hybridization or polymerase 
chain reaction (PCR) for the presence of the transgene. A transgenic mouse 
can be mated to another mouse to determine if the transgene is in the germ 
line of the founder animal. Subsequently, progeny can be bred with each 
other to form pure (homozygous) transgenic lines. 
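
The expected outcome of these crosses can be sketched with a simple
Punnett-square calculation. The Python example below rests on assumptions
that the text does not state: the transgene is integrated at a single
chromosomal site, segregates as an ordinary Mendelian allele, and is
present in all germ cells of the founder (no mosaicism). The allele symbols
"T" (transgene present) and "+" (transgene absent) and the function name
cross are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected offspring genotype frequencies for two diploid parents.

    Each parent is a pair of alleles: "T" (transgene) or "+" (no transgene).
    Genotypes are reported with alleles sorted, e.g. "+T" for hemizygous.
    """
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    founder = ("T", "+")        # founder carrying one copy of the transgene
    nontransgenic = ("+", "+")

    # Test cross: germ line transmission shows up as transgenic progeny.
    print("founder x nontransgenic:", cross(founder, nontransgenic))

    # Intercross of transgenic progeny: some offspring are homozygous ("TT").
    print("progeny intercross:     ", cross(founder, founder))
```

Under these assumptions, half of the progeny of the test cross carry the
transgene, and one-quarter of the offspring from an intercross of such
progeny are homozygous for it.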

The procedure, although apparently simple, requires the coordination
of a number of experimental steps. Even a highly trained practitioner can
expect, at best, only 5% of the inoculated eggs to develop into live
transgenic animals (Fig. 21.4). None of the steps in the procedure is 100%
efficient; consequently, large numbers of microinjected fertilized eggs
must be used. Furthermore, with this method, the injected DNA integrates at
random sites within the genome, and often multiple copies of the injected
DNA are incorporated at one site. Not all of the transgenic pups will have
the appropriate characteristic. In some individuals, the transgene may not
be expressed because of the site of integration, and in others, the copy
number may be excessive and may lead to overexpression, which disrupts
the normal physiology of the animal. Despite the overall inefficiency, it has
become routine to use DNA microinjection of the male pronucleus to create
lines of mice carrying functional transgenes.

FIGURE 21.4 Overall efficiency of the transgenesis process after DNA microinjection.
All the fertilized eggs (100%) of cattle, pigs, sheep, and mice are inoculated with a
transgene, but the success of implantation and giving birth to offspring is much
lower, and only 5% or fewer of the treated eggs become transgenic progeny.
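
The practical consequence of a 5% overall yield can be estimated with a
little arithmetic. The Python sketch below is a rough planning aid, not a
protocol: the function names are hypothetical, the figure of about 35 eggs
per superovulated donor is taken from the text, and 30 eggs per foster
mother is an assumed value within the 25 to 40 quoted above. Because 5% is
a best-case yield, the results should be read as lower bounds.

```python
import math
from fractions import Fraction


def eggs_to_inject(transgenic_pups_wanted, overall_efficiency=Fraction(5, 100)):
    """Minimum number of inoculated eggs for the desired number of
    live transgenic animals, at the given overall efficiency."""
    return math.ceil(transgenic_pups_wanted / overall_efficiency)


def plan_experiment(transgenic_pups_wanted,
                    overall_efficiency=Fraction(5, 100),
                    eggs_per_donor=35,          # from the text (superovulated mouse)
                    eggs_per_foster_mother=30): # assumption, within 25 to 40
    """Rough counts of eggs, donor females, and foster mothers required."""
    eggs = eggs_to_inject(transgenic_pups_wanted, overall_efficiency)
    return {
        "eggs_to_inject": eggs,
        "donor_females": math.ceil(eggs / eggs_per_donor),
        "foster_mothers": math.ceil(eggs / eggs_per_foster_mother),
    }


if __name__ == "__main__":
    print(plan_experiment(10))
```

For example, obtaining 10 transgenic founders at 5% efficiency calls for
at least 200 inoculated eggs, which under these assumptions corresponds to
about 6 superovulated donors and up to 7 foster mothers.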

The Engineered Embryonic Stem Cell Method 

Cells from the early, blastocyst stage of a developing mouse embryo can 
proliferate in cell culture and still retain the capability to differentiate into 
all other cell types—including germ line cells—after they are reintroduced 
into another blastocyst embryo. Such cells are called pluripotent embryonic 
stem cells. When in culture, embryonic stem cells can be readily engineered 
genetically without altering their pluripotency. With this system, for 
example, a functional transgene can be integrated at a specific site within a 
dispensable region of the genome of embryonic stem cells. The genetically 
engineered cells can be selected, grown, and used to generate transgenic 
animals (Fig. 21.5). In this way, the randomness of integration that is 
inherent in the DNA microinjection and retroviral vector systems is 
avoided. 

After transfection of embryonic stem cells in culture with a DNA vector
that is designed to integrate within a specific chromosomal location, some
cells will have DNA integrated at nontarget (spurious) sites, whereas in
other cells, integration will occur at the target (correct) site. The target site
should be located in a section of genomic DNA that encodes no essential
products, so that after integration of the input DNA, there is no interference
with any developmental or cellular functions. Moreover, it is essential that
the transgene be integrated into a part of the genome that does not prevent
it from being transcribed, for example, in euchromatin rather than
heterochromatin (see chapter 7). In most of the embryonic stem cells, the
input DNA will not be integrated at all. To enrich for the cells with DNA
integrated at the target site, a procedure called positive-negative
selection is implemented. This strategy uses positive selection for cells
that have vector DNA integrated anywhere in their genomes and negative
selection against cells in which the vector DNA is integrated at spurious
sites.

FIGURE 21.5 Establishing transgenic mice with genetically engineered
embryonic stem (ES) cells. An embryonic stem cell culture is initiated from
the inner cell mass of a mouse blastocyst. The embryonic stem cells are
transfected with a transgene. After growth, the transfected cells are
identified by either the positive-negative selection procedure or PCR
analysis. Populations of transfected cells can be cultured and inserted
into blastocysts, which are then implanted into foster mothers. Transgenic
lines can be established by crosses from founder mice that carry the
transgene in their germ lines.

A targeting DNA vector for the positive-negative selection procedure
usually contains (1) two blocks of DNA sequences (HB1 and HB2) that are
homologous to separate regions of the target site; (2) the transgene, which
will confer a new function on the recipient; (3) a DNA sequence that codes
for resistance to the compound G-418 (Neo^r); and (4) two different genes for
thymidine kinase (tk1 and tk2) from herpes simplex virus types 1 and 2
(HSV-tk1 and HSV-tk2) (Fig. 21.6A). The arrangement of these sequences is
key to the positive-negative selection procedure. Between the two blocks of
DNA that are homologous to the target site are the transgene and the gene
for G-418 resistance (Neo^r gene). Outside the homologous blocks, one on
each side, are the genes HSV-tk1 and HSV-tk2. If integration occurs at a
spurious site, i.e., not at HB1 and HB2, either one or both of the HSV-tk
genes have a high probability of being integrated along with the other
sequences (Fig. 21.6A). Alternatively, if the integration event is due to
homologous recombination by a double crossover at the target site, the
HSV-tk genes are excluded and only the transgene and the Neo^r gene are
incorporated into the genome (Fig. 21.6B). When transfected cells are grown
in the presence of G-418, all the cells that lack the Neo^r gene are
killed. Therefore, only cells with integrated DNA survive; i.e., these
cells are positively selected. If the compound ganciclovir is added at the
same time as G-418, the cells that express thymidine kinase are killed
because thymidine kinase converts ganciclovir to toxic compounds that kill
cells; i.e., these cells are negatively selected. The cells most likely to
survive this dual-selection scheme are those that have DNA integrated at
the target site. Although not foolproof, the positive-negative selection
method enriches an embryonic stem cell population for cells that carry a
transgene at a specific chromosomal location.

FIGURE 21.6 Positive-negative selection. (A) Result of nonspecific integration. Both
genes for thymidine kinase (tk1 and tk2), the two DNA sequences that are
homologous to a specific chromosomal region in the recipient cells (HB1 and
HB2), a gene (Neo^r) that confers resistance to the cytotoxic compound
G-418, and the transgene (TG) are incorporated into the chromosome. After
transfection, cells are selected for resistance to both G-418 and the
compound ganciclovir, which becomes cytotoxic to cells that synthesize
thymidine kinase. Other nonhomologous integrations may occur and produce
inserts with one or the other of the thymidine kinase genes. After
treatment with G-418 and ganciclovir, all the cells with nonspecific
integration of the input DNA that includes at least one of the thymidine
kinase genes are killed. (B) Result of homologous recombination. The
product of a double crossover between homologous blocks (HB1 and HB2) of
DNA on the vector DNA and on chromosomal DNA does not contain either of the
two thymidine kinase genes (tk1 and tk2). After treatment with G-418 and
ganciclovir, only cells that have undergone homologous recombination
survive.
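
The logic of the dual selection can be summarized as a pair of survival
rules. The following Python sketch is a deliberately simplified model with
hypothetical names (Cell, survives); it treats drug killing as all or
none. The last cell type in the list, a random integrant that happens to
carry neither thymidine kinase gene, is included as an assumed illustration
of why the method is described as not foolproof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    label: str
    has_neo: bool    # carries the G-418 resistance gene
    tk_copies: int   # number of HSV-tk genes integrated (0, 1, or 2)


def survives(cell, g418=True, ganciclovir=True):
    """Apply positive (G-418) and negative (ganciclovir) selection."""
    if g418 and not cell.has_neo:
        return False      # no resistance gene: killed by G-418
    if ganciclovir and cell.tk_copies > 0:
        return False      # thymidine kinase makes ganciclovir toxic
    return True


CELLS = [
    Cell("no integration", has_neo=False, tk_copies=0),
    Cell("random integration, both tk genes", has_neo=True, tk_copies=2),
    Cell("random integration, one tk gene", has_neo=True, tk_copies=1),
    Cell("homologous recombination at target", has_neo=True, tk_copies=0),
    # Assumed rare case: random integration that included neither tk gene.
    Cell("random integration, no tk gene", has_neo=True, tk_copies=0),
]

if __name__ == "__main__":
    for cell in CELLS:
        outcome = "survives" if survives(cell) else "killed"
        print(f"{cell.label}: {outcome}")
```

In this model, G-418 alone leaves every integrant alive, and adding
ganciclovir removes those that carry at least one thymidine kinase gene, so
the survivors are enriched for, but not limited to, correctly targeted
cells.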

A more direct way to detect embryonic stem cells that carry a transgene
at a targeted chromosomal site is to use PCR. The targeting DNA vector
contains two blocks of DNA that are homologous to the target site, with
one on either side of both the transgene and a cloned bacterial or synthetic
(unique) DNA sequence that is not present in the mouse genome (Fig. 21.7).
After the transfection of embryonic stem cells, the cells are pooled and
samples are screened by PCR. One of the primers (P1) for PCR is
complementary to a sequence within the cloned bacterial or synthetic
(unique) DNA sequence of the integrating vector. The other primer (P2) is
complementary to a DNA sequence that is part of the chromosome adjacent to
the region of one of the homologous blocks of DNA. If integration is at a
random site, the predicted amplified DNA product is not synthesized (Fig.
21.7A). However, if site-specific integration occurs, the PCR amplifies a
DNA fragment of known size (Fig. 21.7B). In this way, pools of cells with
embryonic stem cells containing the desired gene at the targeted site can be
identified. By subculturing from these pools, cell lines carrying the
site-specific integration can be established.

FIGURE 21.7 Testing for nonspecific integration and homologous recombination in
transfected cells by PCR. (A) After nonspecific integration of the vector DNA, one
of the primers (P2) is not able to hybridize to a chromosomal site that is a
predetermined distance from the site of hybridization of P1, so a DNA
fragment with a specific size is not amplified. P1 hybridizes to a unique
segment (US) of the input DNA that does not occur in the chromosomal DNA of
the recipient cells. TG, transgene; HB1 and HB2, homologous blocks. (B)
Homologous recombination between DNA sequences (HB1 and HB2) of the input
DNA that are complementary to chromosomal sites (CS1 and CS2) creates
hybridization regions for both P1 and P2 that are a predetermined distance
apart. Amplification by PCR generates a DNA fragment of a specific size
that can be visualized by gel electrophoresis. In this case, the transgene
(TG), which lies between the homologous blocks (HB1 and HB2), is integrated
at a specific chromosomal location.
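
The reasoning behind the PCR screen can be illustrated with a toy in silico
PCR. In the Python sketch below, every sequence is an invented short
string, far shorter than real primers or homology blocks, and the function
names are hypothetical. The model asks only whether the P1 site (in the
unique sequence) and the P2 site (in chromosomal DNA flanking one
homologous block) lie on the same template in the right order and, if so,
how long the product would be.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward, reverse, max_len=3000):
    """Length of the product primed by `forward` and `reverse` on the top
    strand of `template`, or None if no product of at most max_len forms."""
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)  # as it appears on the top strand
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    length = end + len(reverse_site) - start
    return length if length <= max_len else None


# Invented toy sequences (assumptions for illustration only).
HB1, TG, US, HB2 = "AAAACCCC", "ATGCATGCATGC", "CTAGCTAGGATC", "GGGGTTTT"
CS_FLANK = "TTGACCGGTAAC"            # chromosomal DNA next to the target site

P1 = US                              # primer within the unique sequence
P2 = reverse_complement(CS_FLANK)    # primer in the flanking chromosomal DNA

insert = HB1 + TG + US + HB2
targeted_locus = "ACGTACGT" + insert + CS_FLANK + "ACGTACGT"
random_locus = "CCGGAATT" + insert + "AATTCCGG"   # flank is elsewhere in the genome

EXPECTED_SIZE = len(US) + len(HB2) + len(CS_FLANK)

if __name__ == "__main__":
    print("targeted:", amplicon_length(targeted_locus, P1, P2))
    print("random:  ", amplicon_length(random_locus, P1, P2))
```

Only the targeted locus places both primer sites a fixed distance apart, so
only it yields a product of the expected size; the randomly integrated copy
gives none, which mirrors panels A and B of Fig. 21.7.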

Embryonic stem cells carrying an integrated transgene can be cultured 
and inserted into blastocyst stage embryos, and these embryos can then be 
implanted into pseudopregnant foster mothers. Transgenic lines are estab¬ 
lished by mating the progeny that carry the transgene in their germ lines. 
Then, if required, littermates that carry a transgene in their germ lines are 
crossed to produce mice that are homozygous for the transgene. 
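
As a worked illustration of the last cross, the expected genotype fractions can be enumerated in a few lines of Python. This sketch assumes a single integration site that segregates in a simple Mendelian fashion with no effect on viability; the allele symbols ("T" for the transgene, "o" for the empty site) and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Enumerate offspring genotypes from two parents given as allele pairs."""
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

# Two littermates that each carry one copy of the transgene.
offspring = cross("To", "To")
for genotype, fraction in sorted(offspring.items()):
    print(genotype, fraction)
```

Under these assumptions, one-quarter of the progeny are expected to be homozygous for the transgene.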

Not only can a transgene be inserted into a specific chromosome site by 
homologous recombination in embryonic stem cells to provide a new func¬ 
tion, but a specific mouse gene can also be targeted for disruption by the 
incorporation of a DNA sequence, usually a selectable marker gene, into its 
coding region (Fig. 21.8). One of the aims of targeted gene disruption (gene 
knockout) is to determine the developmental and physiological conse¬ 
quences of inactivating a particular gene. In addition, a transgenic line with 
a specific disabled gene can be used as a model system to study the molec¬ 
ular pathology of a human disease. 

For example, inactivation of the mouse rhodopsin gene by targeted 
gene disruption leads to deterioration of the rod cells of the retina in trans¬ 
genic mice that closely resembles the human disease retinitis pigmentosa. 
Thus, the progress of retinal degeneration and the effects of potential 
therapeutic agents that either delay or block the genetically induced retin¬ 
opathy have been studied by using the rhodopsin knockout mouse. 
Hundreds of different types of knockout mice have been created as animal 
models for the study of various human abnormalities. 


FIGURE 21.8 Gene disruption by targeted homologous recombination. The target 
vector carries a selectable marker gene (SMG) with flanking DNA sequences that 
are homologous to regions of the targeted gene. In this example, the targeted gene 
has five exons (1 to 5). Homologous recombination disrupts (i.e., knocks out) the 
targeted gene. p, promoter; pa, polyadenylation signal. 

FIGURE 21.9 A loxP site with the repeats in the opposite (A) and same (B) orientation. 
The arrows indicate the directions of the repeat sequences. [Figure labels for panel A: 
repeat, 5'-ATAACTTCGTATA-3'; spacer, 5'-GCATACAT-3'; repeat, 5'-TATACGAAGTTAT-3'.] 


Genetic Modification with the Cre-loxP Recombination System 

Transgenic mice usually carry a transgene or gene modification (knockout 
gene) in all of their cells. However, it is helpful to have a process that selec¬ 
tively regulates the expression of a gene within a specific somatic tissue or 
cell type. The Cre-loxP recombination system, which is derived from 
genetic elements of bacteriophage P1, has been adapted for this purpose. 

Bacteriophage P1 is one of several tailed viruses that infect Escherichia 
coli. It has a double-stranded DNA genome that is about 100 kb in length. 
After introduction into E. coli, the linear P1 genome forms a circle. The 
circularized P1 DNA acts as a template for replication, and depending on 
which set of genes is activated, the circular form is either maintained as a 
plasmid or used as a template for the production of viral genomes during 
the lytic cycle. On rare occasions, a circularized P1 genome integrates into 
the E. coli chromosome. Circularization and integration of the P1 genome 
are mediated by the product of the cre gene (circularization recombination 
[Cre recombinase] protein), which specifically cleaves and recombines the 
DNA of loxP (locus of crossing over [x] in P1) sites. 

A loxP site consists of two 13-base-pair (bp) inverted repeats that are 
separated from each other by an 8-bp spacer sequence (Fig. 21.9). Briefly, 
Cre-loxP recombination entails the coming together of two remote loxP 
sites, each of which has two bound Cre recombinase molecules; cleavage by 
the Cre recombinase within the spacer regions between the repeat sequences; 
and the exchange and joining of DNA strands to form recombined DNA 
molecules. The outcome of the recombination event depends on the 
orientation of the repeats of the loxP sites (Fig. 21.10). If the repeats are in opposite 
directions, then the exchange inverts the DNA between the two loxP sites 
(Fig. 21.10A). If the repeats are in the same orientation, then the intervening 
sequence is excised (Fig. 21.10B). The repeat elements of bacteriophage P1 
are naturally in opposite orientations. The Cre-loxP recombination system 
can function when the loxP sites are widely separated. For example, the two 
loxP sites that are essential for circularization of a P1 genome are about 100 
kb apart. The specificity of the Cre-loxP system is absolute, because the Cre 
recombinase acts exclusively on loxP sites. 
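
The dependence of the outcome on orientation can be made concrete with a short Python sketch. It is a simplified string model, not a description of the enzyme mechanism: it treats each 34-bp loxP site as a unit, reads the orientation of a site from its asymmetric 8-bp spacer, assumes exactly two nonoverlapping sites on a linear molecule, and uses hypothetical function names. The repeat and spacer sequences are those labeled in Fig. 21.9.

```python
# Simplified model of Cre-mediated recombination between two loxP sites.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

REPEAT = "ATAACTTCGTATA"                  # 13-bp repeat
SPACER = "GCATACAT"                       # 8-bp asymmetric spacer
LOXP = REPEAT + SPACER + revcomp(REPEAT)  # 34-bp site, "+" orientation
LOXP_RC = revcomp(LOXP)                   # same site, "-" orientation

def find_loxp(dna):
    """Return (position, orientation) for every loxP site in dna."""
    sites = []
    for i in range(len(dna) - len(LOXP) + 1):
        window = dna[i:i + len(LOXP)]
        if window == LOXP:
            sites.append((i, "+"))
        elif window == LOXP_RC:
            sites.append((i, "-"))
    return sites

def cre_recombine(dna):
    """Return (outcome, retained molecule, excised circle or None)."""
    sites = find_loxp(dna)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    (a, ori_a), (b, ori_b) = sites
    n = len(LOXP)
    if ori_a == ori_b:
        # Same orientation: the intervening DNA leaves with one loxP site,
        # and one loxP site stays behind.
        return "excision", dna[:a + n] + dna[b + n:], dna[a + n:b + n]
    # Opposite orientation: the DNA between the sites is inverted.
    return "inversion", dna[:a + n] + revcomp(dna[a + n:b]) + dna[b:], None

same = "AAAA" + LOXP + "GATTACA" + LOXP + "TTTT"
opposite = "AAAA" + LOXP + "GATTACA" + LOXP_RC + "TTTT"
print(cre_recombine(same)[0])
print(cre_recombine(opposite)[0])
```

The excised segment is returned as a linear string here, although in the cell it would be a circle carrying one loxP site.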

On the basis of these features, the Cre-loxP system was developed for 
producing cell-specific gene modifications in mouse cells. As a first step in 
the overall strategy, the cre gene is isolated and placed under the control of 
a cell-specific promoter. Transgenic mice with the cre gene construct are 
established, and the tissue specificity of the Cre activity is confirmed. Next, 
a loxP site with repeat sequences in the same direction is inserted on either 
side of a cloned DNA sequence, such as a cloned exon (Fig. 21.11). The 
construct is integrated into a chromosome site of embryonic stem cells by 
homologous recombination. These cells are selected, cultured, and used to 
establish a transgenic mouse line. Then, a transgenic mouse with the 
tissue-specific cre transgene is mated with a transgenic mouse with the integrated 
loxP-flanked sequence. The DNA between the two loxP sites is deleted after 
the cre transgene is expressed in double-transgenic organisms (Fig. 21.11). 
In this way, the biological consequences of the loss of activity of a gene in 
a specific tissue can be monitored. The Cre-loxP recombination system can 
also be used to activate a transgene in a specific tissue. In this case, the 
sequence between the loxP sites prevents transcription. This construct is 
inserted between the promoter and the coding sequence of a transgene. 
When Cre is expressed in mice with the integrated construct, the DNA 
sequence that blocks transcription is excised, thereby enabling the 
expression of the transgene (Fig. 21.12). By controlling expression of Cre, for 
example, by adding an inducer that activates the cre promoter to the 
drinking water of mice, the expression of a transgene can be controlled. 

FIGURE 21.10 The outcome of Cre recombinase-mediated recombination depends on 
the relative orientation of the loxP sites. (A) Recombination between loxP sites in 
opposite orientations results in inversion of the DNA sequence between the two 
loxP sites. (B) Recombination between loxP sites in the same orientation results in 
deletion of the DNA sequence between the two loxP sites. 

The Cre-loxP technology has been used extensively to study the 
biological consequences of tissue-specific gene inactivation with the goal of 
establishing models for human diseases. For example, selective removal of 
the kinesin II gene, which is expressed exclusively in retinal photoreceptor 
cells, leads to an accumulation of opsin and arrestin and eventually to cell 
death. This result mimics aspects of inherited retinitis pigmentosa in 
humans and is used for detailed studies of the pathophysiological effects 
on the retina. 

FIGURE 21.11 Cre-loxP recombination system for inactivating a gene in a specified 
cell type. (1) The cre gene is placed under the control of a cell-specific promoter (pcs) 
and established as a transgene in a line of mice. (2) A loxP site is cloned on either 
side of an exon. The construct with the loxP sites is introduced into a chromosomal 
site of embryonic stem cells by homologous recombination, and a transgenic mouse 
line is established with these cells. The two transgenic lines are crossed. p, promoter. 
(3) In cells where both constructs are present, the Cre recombinase (beige circle) is 
synthesized, and two Cre molecules bind to each loxP site (dashed arrow). (4) The 
loxP sites undergo recombination (x), leading to the excision and circularization of 
a loxP site and an exon (Exon 2) that is eventually degraded and the formation of 
an inactivated gene that is retained in the chromosome. The right-angled arrow 
denotes transcription. 


Large chromosomal aberrations, such as deletions, can also be created 
with the Cre-loxP system. In humans, a large deletion within chromosome 
22 is associated with DiGeorge syndrome (DGS), which has cardiovascular 
dysfunction as a significant characteristic. It is not known whether DGS is 
due to the loss of a large number of genes or a few major ones. To deter¬ 
mine the basis of DGS, a large deletion of the mouse chromosome that is 
comparable (syntenic) to human chromosome 22 was generated with the 
Cre-loxP excision strategy. The mice with this deletion had symptoms that 
resemble DGS. Moreover, when a transgene for a cardiovascular-specific 
transcription factor from this region was introduced into these mice, the 
DGS-like effects were partially overcome, which suggests that the loss of 
this gene in humans plays a key role in DGS. 

FIGURE 21.12 Cre-loxP recombination system for activating a transgene in a specified 
cell type. (1) The cre gene is placed under the control of a cell-specific promoter (pcs) 
and established as a transgene in a line of mice. (2) A piece of DNA with loxP sites 
that flank a transcription termination sequence (hatched box) is cloned between a 
promoter (p) and the first exon (Exon 1) of a gene. (3) The construct with the loxP 
sites is introduced into a chromosomal site of embryonic stem cells by homologous 
recombination, and a transgenic mouse line is established with these cells. The two 
transgenic lines are crossed. In cells where both constructs are present, the Cre 
recombinase (beige circle) is synthesized, and two Cre molecules bind to each loxP 
site (dashed arrow). (4) The loxP sites undergo recombination (x), leading to the 
excision and circularization of a loxP site and a transcription termination sequence 
that is eventually degraded and the formation of a transcriptionally active trans¬ 
gene that is retained in the chromosome. The right-angled arrows denote transcrip¬ 
tion, and the parallel vertical bars indicate termination of transcription. 


RNA Interference 

There are two main methods to silence gene expression in animal cells to 
study biological processes. One method abolishes expression of a gene by 
targeted disruption through homologous recombination in embryonic 
stem cells (knockout method), and the other decreases the expression of a 
target gene (knockdown method) by preventing messenger RNA (mRNA) 
translation using RNA interference (RNAi). The latter method exploits a natural 
mechanism for regulation of gene expression by endogenous RNA molecules in 
animals and plants and for protecting cells against exogenous RNA molecules 
from invading viruses. In RNAi, double-stranded RNA is recognized by a 
ribonuclease (RNase) called Dicer that cleaves the RNA into smaller 
double-stranded RNA molecules known as small interfering RNA (siRNA). 
A large nuclease complex called RISC (RNA-induced silencing complex) 
separates the strands of siRNA, and the single-stranded RNA products, 
together with RISC, bind to homologous sequences on mRNA molecules 
(Fig. 11.13). The nuclease component of RISC then degrades the mRNA, 
which prevents the encoded protein from being synthesized. Short endog¬ 
enous RNAs (micro-RNAs) transcribed from regions of the genomes of 
animals (and plants) are also recognized by Dicer and RISC and block 
translation of a target mRNA. 

To create transgenic mice with reduced expression of a target gene, a 
small region of the target sequence is cloned into a vector as an inverted 
repeat separated by a short spacer region. The RNA transcript that is pro¬ 
duced from this sequence forms a short (19- to 21-bp) hairpin RNA (small 
[or short] hairpin RNA [shRNA]) due to intramolecular base pairing (Fig. 
21.13). Shorter sequences are used to avoid a general downregulation of 
translation that is often elicited with longer sequences (thought to occur as 
part of the viral defense response). The construct is introduced into mouse 
embryonic stem cells by pronuclear injection or on a lentiviral vector. Stable 
mouse lines have been generated that produce an shRNA that is processed 
by the animal's Dicer and RISC proteins into siRNA to reduce the expres¬ 
sion of a target gene. This technique has been applied to a variety of ani¬ 
mals, including cows, pigs, goats, frogs, and rats. 
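
The layout of such an inverted-repeat insert can be sketched in Python. This is an illustration only: the target sequence is an arbitrary 19-nucleotide placeholder, the 9-nucleotide loop is an assumed spacer chosen for the example, and the function names are hypothetical. Real designs also have to consider promoter requirements, terminators, and off-target matches, none of which are modeled here.

```python
# Sketch of an shRNA-encoding insert: sense arm + spacer (loop) + antisense arm.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def shrna_insert(target, loop="TTCAAGAGA"):
    """Build the DNA insert encoding a hairpin against a 19- to 21-nt target.

    The loop sequence is an assumption made for this example.
    """
    target = target.upper()
    if not 19 <= len(target) <= 21:
        raise ValueError("target must be 19 to 21 nucleotides long")
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, and T")
    return target + loop + revcomp(target)

def stem_is_paired(insert, stem_len):
    """Check that the two arms can base pair with each other (a hairpin stem)."""
    return insert[:stem_len] == revcomp(insert[-stem_len:])

target = "GCTGACCCTGAAGTTCATC"  # placeholder target region
insert = shrna_insert(target)
print(insert)
print("arms can pair:", stem_is_paired(insert, len(target)))
```

Because the second arm is the reverse complement of the first, the transcript of this insert can fold back on itself, with the spacer forming the loop.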


FIGURE 21.13 RNAi to knock down expression of a target gene in transgenic mice. A 
transgenic construct encoding an shRNA to target specific mRNA for degradation 
and a green fluorescent protein marker (gfp), each under the control of its own pro¬ 
moter (p) (the direction of transcription is indicated by black arrows), is introduced 
into mouse embryonic stem cells by pronuclear injection or on a lentiviral vector. 
Transgenic mice are identified by production of green fluorescent protein (GFP). 
Transcription of the transgene encoding short inverted repeats (blue arrows) sepa¬ 
rated by a spacer sequence yields an shRNA that is processed by the host cell Dicer 
and RISC proteins to reduce expression of a target gene. 

FIGURE 21.14 Conditional knockdown of target gene expression using RNAi. A 
blocking sequence, flanked by loxP sites in the same orientation, is inserted between 
the sequence encoding the shRNA and the promoter that controls expression of the 
hairpin RNA (p). Cre recombinase-mediated recombination between loxP sites 
excises the blocking sequence and restores expression of the hairpin RNA. 


In cases in which reduced expression of a target gene might impair the 
growth and development of the animal, the timing of expression of the 
shRNA can be controlled. For conditional knockdown, a blocking sequence 
is inserted into the vector between the sequence encoding the shRNA and 
the promoter that controls expression of the hairpin RNA. The blocking 
sequence carries a termination signal for RNA polymerase and can also 
contain a marker gene, such as gfp, encoding green fluorescent protein 
under the control of its own promoter. The blocking sequence is flanked by 
two loxP sites in the same orientation and therefore is excised following the 
induction of Cre recombinase. Excision of the blocking sequence restores 
expression of the hairpin RNA (Fig. 21.14) and downregulation of the 
target gene in the mouse genome. 

Transgenesis with High-Capacity Vectors 

Generally, transgenes are complementary DNAs (cDNAs), small genes 
(less than 20 kb), or parts of genes. Often, cDNAs are poorly expressed in 
mammalian cells. Also, when a segment of genomic DNA is used for trans¬ 
genesis, important gene-specific regulatory sequences that lie either 
upstream or downstream of the gene are rarely retained as part of the 
insert. Moreover, complete genes and multigene complexes are too large 

Genetic Transformation of Mouse Embryos by 
Microinjection of Purified DNA 

J. W. Gordon, G. A. Scangos, D. J. Plotkin, J. A. Barbosa, 
and F. H. Ruddle 

Proc. Natl. Acad. Sci. USA 77:7380-7384, 1980 

Somatic Expression of Herpes Thymidine Kinase in 
Mice following Injection of a Fusion Gene into Eggs 

R. L. Brinster, H. Y. Chen, M. Trumbauer, A. W. Senear, R. Warren, 
and R. D. Palmiter 
Cell 27:223-231, 1981 


Gordon et al. were the first to 
show the feasibility of DNA 
transfer by microinjection into 
the pronucleus of the mouse egg. In 
their study, the procedure was tested 
by microinjecting several hundred 
eggs with a vector-gene construct that 
consisted of pBR322 carrying both the 
HSV thymidine kinase gene and a 
piece of the simian virus 40 genome. 
Of the 78 offspring that were obtained 
from the surrogate mothers, 2 retained 
some plasmid DNA. The authors con¬ 
cluded, "These data demonstrate that 
it is possible to use a recombinant 
plasmid as a vector to transfer foreign 
genes directly into mouse embryos, 
and that these embryos can maintain 
the foreign genes throughout develop¬ 
ment." Unfortunately, the plasmid 
DNA was not intact, and the HSV thy¬ 
midine kinase sequence did not 
become a transgene. 

On the other hand, Brinster et al., 
who microinjected the pronuclei of a 
number of mouse eggs with a plasmid 
carrying the HSV thymidine kinase 
gene under the control of the pro¬ 
moter of the metallothionein I gene, 
found that one of their transgenic 
mice expressed HSV thymidine kinase 
at a high level in its liver and kidneys 
in comparison to three other transgenic 
mice that produced low levels of 
this enzyme. Also, eight other transgenic 
animals carried the HSV thymidine 
kinase sequence but did not 
produce any active HSV thymidine 
kinase. Southern blot analysis showed 
that all of the transgenic mice con¬ 
tained multiple copies of the microin¬ 
jected DNA. 

These two studies laid the founda¬ 
tion for transgenesis of mice. Despite 
the technical complexity and relative 
inefficiency of the microinjection 
strategy, it has been exceptionally suc¬ 
cessful. Currently, scores of strains of 
mice with either foreign genes (trans¬ 
genic mice) or endogenous genes that 
have been disrupted by the insertion 
of foreign DNA (knockout mice) are 
being used for studying gene regula¬ 
tion, mammalian development, viral 
pathogenesis, cancer, toxicology, and 
the mutagenicities of various agents, 
among other things. In addition, trans¬ 
genic and knockout mice are of con¬ 
siderable biomedical importance as 
model systems for human diseases. 


for conventional vectors. For these reasons, high-capacity vectors that carry 
genomic DNAs ranging in size from 100 to more than 1,000 kb have been 
developed for transgenesis. These vectors include bacterial artificial 
chromosomes, P1 bacteriophage-derived artificial chromosomes, mammalian 
artificial chromosomes, and yeast artificial chromosomes (YACs). A number of 
transgenic mice have been produced by microinjection 
of the pronucleus of the fertilized egg or transfection of embryonic 
stem cells with YACs (described in chapter 7) that carry either an array of 
related genes or a single large gene. These organisms have been used to 
study developmental processes, as models for human disorders, and for 
the production of human therapeutic agents. 

The production of mice that synthesize only human antibodies is 
another noteworthy example of YAC transgenesis. In theory, monoclonal 
antibodies can be effective agents for diminishing the proliferation of 
cancer cells and as a means of treating other human diseases. However, it 
is impossible to generate human monoclonal antibodies routinely. Also, 
unfortunately, rodent monoclonal antibodies are immunogenic to humans 
and elicit anti-mouse antibodies that result in destruction of the therapeutic 
antibody and sometimes allergic reactions. Recombinant DNA strategies 
have been devised to "humanize" existing rodent monoclonal antibodies. 

An antibody is a tetrameric protein with two pairs of dissimilar chains. 
One of the chains is called the heavy chain, and the other is a light chain. 
The terms "heavy" and "light" refer to the difference in the molecular 
masses of the antibody subunits. The genetic information for a specific 
heavy chain is created by rearrangement of several heavy-chain-specific 
DNA segments in a B cell (an antibody-producing cell). In addition, there 
are two different types of antibody light chains that are encoded after DNA 
rearrangements of other, light-chain-specific DNA segments. Each single B 
cell synthesizes only one kind of antibody molecule that has a unique set 
of rearranged segments for a heavy chain and a light chain. 

The genetic repertoire for the formation of the vast numbers of dif¬ 
ferent human antibodies consists of more than 100 heavy-chain DNA seg¬ 
ments and a similar number of light-chain DNA segments. Each heavy- and 
light-chain gene locus is about 1 to 1.5 megabases in length. To create a 
transgenic mouse that is capable of synthesizing a full range of human 
antibodies against every antigen, the endogenous mouse heavy- and light- 
chain genes were inactivated, and YACs carrying most of the heavy- and 
light-chain DNA elements from each human immunoglobulin gene were 
inserted into the chromosomal DNA of the mouse (Fig. 10.28). A commer¬ 
cialized version of the human antibody-producing mouse has been desig¬ 
nated the XenoMouse, and the first fully human monoclonal antibody 
produced in this mouse (panitumumab) has received regulatory approval 
for use as a treatment for advanced colorectal cancer. Clinical trials have 
shown that panitumumab is an effective control agent for colorectal cancer 
and does not elicit production of anti-panitumumab antibodies. The devel¬ 
opment of this cancer treatment took approximately 15 years to regulatory 
approval, including the development of the XenoMouse. Other therapeutic 
antibodies produced in the XenoMouse, including several for the treatment 
of various cancers and osteoporosis, are now in clinical trials. 


Transgenic Mice: Applications 

Transgenic mice can be used as model systems for determining the bio¬ 
logical basis of human diseases and devising treatments for various condi¬ 
tions. In addition, transgenesis of mice is an exemplary system for proving 
whether the production of a potential therapeutic agent is feasible. Whole- 
animal models simulate both the onset and progression of a human dis¬ 
ease. However, a mouse is not a human, even though it is a mammal, and 
so the information gathered from some transgenic models may not always 
be medically relevant. In other instances, however, critical insights into the 
etiology of a complex disease can be gained. With this in mind, mouse 
models for human genetic diseases, such as Alzheimer disease, amyo¬ 
trophic lateral sclerosis, Huntington disease, arthritis, muscular dystrophy, 
tumorigenesis, hypertension, neurodegenerative disorders, endocrinolog¬ 
ical dysfunction, and coronary disease, as well as many others, have been 
developed. 

Transgenic Disease Models: Alzheimer Disease 

Alzheimer disease is a degenerative brain disorder that is characterized by 
the progressive loss of both abstract thinking and memory and is accompa¬ 
nied by personality change, language disturbances, and a slowing of phys¬ 
ical capabilities. Clinical diagnosis of Alzheimer disease is poor, although 
1% of the population between 60 and 65 years of age and 30% of the popula¬ 
tion over 80 years of age may develop it. Neurofibrillary tangles accumulate 
within the cell bodies of the neurons, dense extracellular aggregates called 


FIGURE 21.15 Schematic representation 
of a neuron of the human cerebral 
cortex showing some of the histopatho- 
logical features of Alzheimer disease. 
Senile plaques containing amyloid 
deposits and apparent cellular debris 
accumulate at the synaptic junction of a 
neuron. Within the cell body of a 
neuron, neurofibrillary tangles contain 
aggregated cytoskeletal and other pro¬ 
teins. Other changes that occur in 
affected neurons are not depicted. 


senile plaques develop at the ends of inflamed nerves (neuritis), and brain 
cells (neurons) are lost in the neocortex and hippocampus of the brain in 
patients with Alzheimer disease (Fig. 21.15). The core of a senile plaque is 
composed of a closely packed, fibrillar structure that traditionally has been 
called an amyloid body. Originally, amyloid bodies were thought to be 
made up of carbohydrates, but more definitive analysis established that 
they are protein aggregates. However, despite the misnomer, the term 
"amyloid" has been retained. 

The principal protein of Alzheimer disease amyloid bodies is a 4-kilodalton 
protein called Aβ (amyloid β, β-protein, β-amyloid protein, or β/A4). 
The Aβ protein ranges in length from 39 to 42 amino acid residues; the 
Aβ40 and Aβ42 forms are the main variants. All Aβ proteins are derived 
from an internal proteolytic cleavage of the β-amyloid precursor protein 
(APP). Faulty cleavage of the APP protein causes the production of Aβ40 
and Aβ42, and inefficient clearance of the variants likely leads to their 
accumulation. A small number of families with a high incidence of Alzheimer 
disease have mutations in the APP gene, a finding that implicates this gene 
in the disorder. Unfortunately, for the most part, it is impossible to study 
the onset and pathogenesis of Alzheimer disease in human subjects. 
Accordingly, an animal model that mimics Alzheimer disease is an invalu¬ 
able research tool. 

Mouse models for Alzheimer disease were created with transgenes that 
contain mutations in the APP gene that occur in some families with a high 
incidence of early onset (before 50 years of age) of Alzheimer disease. In 
one set of these families, site 717 of APP (APP-717) contains phenylalanine 
instead of valine. In another group of families with Alzheimer disease, sites 
670 and 671 of APP (APP-670/671) contain asparagine and leucine instead 
of lysine and methionine. 

A transgene with the APP-717 mutation was constructed from an APP 
cDNA. Modified introns were added between exons 6 and 7, 7 and 8, and 
8 and 9 of the APP cDNA. The introns were introduced into the APP cDNA 
because experiments have shown that transgenes with introns have an 
increased rate of transcription in comparison to constructs without them. 
The "APP cDNA-intron" construct is controlled by the platelet-derived 
growth factor β promoter that is expressed in brain tissue (Fig. 21.16). The 
complete construct is called the PDAPP minigene. Aging transgenic mice 
(more than 6 months old) with about 40 copies of the PDAPP minigene 
display amyloid plaques, neuronal cell death, and memory defects. An 
APP-670/671 gene construct that is driven by a brain-specific promoter also 
produces transgenic mice with Alzheimer disease-like features, including 
an excess of Aβ42. Interestingly, neither aging PDAPP minigene mice nor 
APP-670/671 transgenic mice have neurofibrillary tangles. Possibly, these 
structures are a secondary response to the overproduction of Aβ42 in 
humans. 

The formation of amyloid plaques in humans has also been shown to 
be associated with increased production of the protein BACE1 (β-site 
APP-cleaving enzyme 1), one of the proteases that cleaves APP to produce Aβ. 
Transgenic mice that produce Aβ but also carry a knockout mutation in 
BACE1 do not develop Aβ amyloid plaques. However, BACE1 knockout 
mice exhibit deleterious behavioral defects, which indicates that some 
BACE1 is required for normal development and/or normal adult brain 
activity. RNAi to reduce, but not abolish, production of BACE1 may therefore 
represent an attractive treatment to reduce or delay Alzheimer disease. 

FIGURE 21.16 DNA construct for modeling Alzheimer disease in transgenic mice. The 
exons of the APP cDNA are marked by numbers (1 to 18), and the introduced 
introns are indicated by letters (A to C). The regulatory sequences are the 
platelet-derived growth factor β (PDGF-β) promoter and simian virus 40 poly(A) sequence. 
The dot marks the exon with the APP-717 mutation. The construct is called the 
PDAPP minigene. 

This strategy is currently being tested in mouse models using mice that 
carry transgenes encoding mutant forms of APP that are genetically linked 
to familial Alzheimer disease in humans. shRNAs that target BACE1 
mRNA were carried on a lentiviral vector that was injected into the hip¬ 
pocampus (the region of the brain where amyloid plaques associated with 
Alzheimer disease are typically observed) in transgenic mice (Fig. 21.17). 
The mice injected with the RNAi construct showed a 38% reduction in the 
Aβ deposits and plaque formation within 1 month of injection. While there 
is much to learn about Alzheimer disease, the availability of animal models 
has helped to elucidate the molecular basis, and revealed some potential 
targets for treatment, of a disorder that affects about 4 million people in the 
United States at an annual cost of around $100 billion. 

Using Transgenic Mice as Test Systems 

The testing of RNAi in transgenic mice as a potential therapy to reduce 
levels of proteins that contribute to Alzheimer disease is one of many 
examples of the utility of transgenic laboratory animals in testing strategies 
to treat human genetic diseases. Transgenic mice have also been used to 
test the efficacy of strategies to protect animals against infectious diseases. 
One recent illustration is the development of transgenic mouse lines that 
express a soluble form of a porcine membrane receptor that could protect 
pigs against pseudorabies virus infection. 

Pseudorabies virus is an alphaherpesvirus that infects pigs and causes 
major economic losses to pig producers. Viral infection can result in 
encephalitis and respiratory illness in young pigs and abortion and infer¬ 
tility in sows. Although vaccines are available and provide some protec¬ 
tion, pseudorabies remains endemic in most regions of the world. An 
alternative approach has been proposed to produce transgenic pigs that 
resist pseudorabies virus infection or block replication of the pseudorabies 

FIGURE 21.17 Lentiviral vector to knock down BACE1 expression in mouse models of 
Alzheimer disease. A sequence encoding the inverted repeats (blue arrows) of an 
shRNA targeting BACE1 mRNA was inserted into the lentiviral vector downstream 
of the mouse RNA polymerase III promoter (pU6). The gene for green fluorescent 
protein (gfp) under the control of the cytomegalovirus promoter (pCMV) is used as a 
reporter gene to identify infected mouse cells. LTR, long terminal repeat (at the 5' 
and 3' ends of the vector); ψ, viral packaging signal; PPT, polypurine tract sequence; 
WPRE, woodchuck posttranscriptional regulatory element for enhancing 
transduction and expression of the transgene in host cells. 

virus genome. One proposed strategy is to block the entry of the virus into 
host cells by expressing a soluble form of the host cell receptor to which the 
virus normally attaches. It was hypothesized that expression of a soluble 
form of the receptor would prevent the virus from binding to the host 
membrane-bound receptor, the step that precedes viral penetration of the 
host cell. Several porcine alphaherpesvirus receptors were identified, and 
one promising receptor, nectin-1, provided protection to transformed cell 
lines challenged with the pseudorabies virus and exhibited broad speci¬ 
ficity against alphaherpesviruses. Before the creation of transgenic farm 
animals, nectin-1 was tested for its ability to protect against pseudorabies 
infection in a mouse model. 

The DNA sequence encoding the extracellular domain of the nectin-1 
receptor was fused to the gene for the constant (Fc) region of human 
immunoglobulin G (IgG) and placed under the control of a promoter that enabled 
expression of the fusion protein in several cell types (Fig. 21.18). The fusion 
construct was designed to produce a secreted form of the nectin-1 receptor 
with enhanced stability and immunogenicity that would promote removal 
of the virus by the host immune system. When transgenic mice expressing 
the fusion protein were exposed to pseudorabies virus at 20 times the dose 
normally required to kill 50% of the animals, through intraperitoneal 
injection, 98% survived, in contrast to less than 10% of nontransgenic mice. 
Moreover, antibodies against the virus were not detected in the transgenic 
mice, which indicated that they were not infected. A similar level of 
protection was observed when the pseudorabies virus was inoculated through 
the nasal passage, which is the normal route of exposure in pigs; the 
epithelial cells lining the nasal passage and respiratory tract produced the 
nectin-1-IgG fusion protein. These results demonstrate that expression of a 
secreted form of the pseudorabies receptor in transgenic mice can protect 
the animals against viral infection. Nectin-1 is known to play a role in 
several important cell functions; however, transgenic mice expressing the 
protein appeared normal. Although expression of the porcine receptor in 
transgenic mice provided effective protection against the virus, it remains 
to be seen if the same level of protection is provided by expression of the 
soluble form of the receptor protein in pigs. 

Conditional Regulation of Transgene Expression 

Various protocols have been devised for turning on and off the expression 
of a transgene in a specific cell type at will. Of these methods, the 
tetracycline-inducible 


FIGURE 21.18 Construct used to generate transgenic mice protected against 
pseudorabies virus infection. The extracellular domain of the porcine nectin-1 receptor was 
fused to the constant region of the human IgG protein (IgG-Fc) to create a stable, 
soluble form of the receptor that, when expressed in transgenic animals, would 
prevent entry of the pseudorabies virus into host cells. The cDNA sequence 
encoding this fusion protein was expressed under the control of the CAG promoter 
(pCAG), which includes the chicken β-actin promoter and the cytomegalovirus 
enhancer sequence. The polyadenylation signal (pA) from the rabbit β-globin gene 
was also included at the 3' end of the construct. The DNA fragment was 
microinjected into mouse eggs to generate transgenic founder mice. Adapted from Ono et 
al., Proc. Natl. Acad. Sci. 101:16150-16155, 2004. 

system has been used extensively. This system is based on 
two transcription units in the same cell, with the product of one of the units 
determining the expression of a gene(s) of the other unit. In one form of the 
tetracycline-inducible system, the addition of doxycycline, a nontoxic 
analogue of tetracycline, turns off the expression of a transgene. Without 
doxycycline, the transgene is continuously expressed in a specific cell type. This 
"tet-off" system depends on the production of a chimeric (hybrid) protein 
composed of the tetracycline repressor and an amino acid sequence that 
activates the transcription process. The hybrid protein is called tetracycline- 
controlled transactivator (tetracycline transactivator, or tTA). The gene 
(tTA) that encodes tTA is under the control of a cell-specific promoter. The 
promoter that drives the transgene consists of a set of tetracycline operator 
(tetO) sequences upstream from a strong eukaryotic promoter. The tTA 
protein binds to the tetO region and activates the transcription of the 
transgene. The binding of the tTA protein to the tetO-promoter region is 
absolutely required for the initiation of transcription. On the other hand, when 
doxycycline is present, it binds to the tTA protein, and the doxycycline-tTA 
complex cannot bind to the tetO-promoter sequence, so transcription of the 
transgene does not occur. Thus, the presence or absence of doxycycline acts 
as a switch whenever and wherever the cell-specific promoter of the tTA 
gene is active (Fig. 21.19A). 

A reverse system, called "tet-on," in which doxycycline must be present 
for transcription of the transgene, has also been devised. In this form, the 
nucleotide sequence for the tetracycline repressor carries mutations that 
prevent the combined repressor protein and transactivator from binding to 
the tetO-promoter sequence. This tetracycline repressor/transactivator 
protein is designated rtTA (reverse tetracycline-controlled transactivator). 
However, doxycycline binds to rtTA and changes its configuration, which 
allows the complex to attach to the tetO-promoter sequence and initiate 
transcription of the transgene (Fig. 21.19B). 
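
The switching logic of the two systems can be summarized as a small truth table. The Python sketch below is only an illustrative model of the description above, not a laboratory protocol. The function name transgene_transcribed and its parameters are hypothetical, and the model assumes an all-or-none response, so it ignores dose dependence and leaky expression.

```python
def transgene_transcribed(system: str, doxycycline: bool,
                          promoter_active: bool = True) -> bool:
    """Return True if the tetO-driven transgene is transcribed.

    system          -- "tet-off" (tTA) or "tet-on" (rtTA)
    doxycycline     -- whether doxycycline is present
    promoter_active -- whether the cell-specific promoter that drives
                       the tTA/rtTA gene is active in this cell type
    """
    if system not in ("tet-off", "tet-on"):
        raise ValueError("system must be 'tet-off' or 'tet-on'")
    if not promoter_active:
        # No transactivator is made, so the transgene stays silent.
        return False
    if system == "tet-off":
        # tTA binds tetO only when it is free of doxycycline.
        return not doxycycline
    # rtTA binds tetO only as a doxycycline-rtTA complex.
    return doxycycline


if __name__ == "__main__":
    for system in ("tet-off", "tet-on"):
        for dox in (False, True):
            state = "ON" if transgene_transcribed(system, dox) else "OFF"
            print(f"{system:7s} doxycycline={dox!s:5s} -> transgene {state}")
```

In this model the cell-specific promoter acts as a gate: where it is inactive, neither system expresses the transgene regardless of doxycycline.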

Both transcription units of a tetracycline-regulatory system can be 
incorporated into a single plasmid; this reduces the number of steps that 
are required for the production of transgenic mice. Doxycycline is 
administered by adding it to the drinking water of the mice. The tet-off and tet-on 
systems have innumerable uses. For example, the biological consequences 
of the production of a defective protein or overexpression of a normal 
protein can be examined in detail, cell-specific disease conditions can be 
simulated, and gene-based treatments for diseases that affect a particular cell 
type can be tested. 

A fascinating example of the utility of the tetracycline-regulatory 
system is the development of a mouse model for Huntington disease, a 
disease that normally occurs only in humans. This incurable, fatal 
neurological disorder affects about 1 in 10,000 people worldwide. The symptoms 
in most cases become evident when the patient is about 45 years old. 
Initially, muscle coordination is impaired. The disorder is progressive and 
unremitting. Eventually, both voluntary and involuntary movements 
become uncontrolled, speech is slurred, and severe psychiatric conditions 
appear. In the late stage of Huntington disease, the patients are mute, 
cognitively nonfunctional, and immobilized. The disease often lasts about 15 
years from the time of onset. The neurological damage is confined to 
specific regions of the brain. At the genetic level, the alteration that is 
responsible for Huntington disease is the addition of CAG units to an existing 
sequential array of these trinucleotides in exon 1 of the HD gene, which 

FIGURE 21.19 Schematic representation of tetracycline-regulated gene expression. 

(A) Tet-off system. The tTA gene is driven by a cell-specific promoter (pcs) and 
encodes a tetracycline repressor sequence (red) and a transcription activator (blue). 
The product of the tTA gene is a protein (tTA [purple circle]) that binds to a 
tetracycline operator (tetO)-eukaryotic promoter (p) sequence and, in the absence of 
doxycycline (- Dox), activates the transcription of the transgene. When doxycycline is 
present (+ Dox), it (yellow rectangle) binds to tTA, and the Dox-tTA complex 
cannot bind to the tetO-promoter region, so the transgene is not transcribed. 

(B) Tet-on system. The rtTA gene is driven by a cell-specific promoter (pcs) and 
encodes a mutated tetracycline repressor sequence (dark yellow) and a 
transcription activator (blue). The product of the rtTA gene is a protein (rtTA [pink circle]) 
that does not bind to a tetracycline operator (tetO)-eukaryotic promoter (p) 
sequence, and in the absence of doxycycline (- Dox), the transgene is not 
transcribed. When doxycycline is present (+ Dox), it (yellow rectangle) binds to rtTA, 
and the Dox-rtTA complex attaches to the tetO-promoter region and initiates the 
transcription of the transgene. 


encodes the huntingtin protein. A CAG trinucleotide is the codon for 
glutamine, and during translation, a contiguous set of these codons produces 
a string of glutamine residues (polyglutamine) in the huntingtin protein. 
Symptoms of Huntington disease occur when the polyglutamine-coding 
segment has 38 or more CAG codons. 
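
The relationship between CAG repeat number and polyglutamine length can be illustrated with a short Python sketch. It is a hedged illustration only: the function names are hypothetical, the flanking nucleotides in the example are invented placeholders rather than the real HD exon 1 sequence, and the 38-codon threshold is simply the figure quoted above.

```python
import re

# Threshold quoted in the text: symptoms occur with 38 or more CAG codons.
HD_CAG_THRESHOLD = 38


def longest_cag_run(dna: str) -> int:
    """Return the number of codons in the longest contiguous run of CAG."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def polyglutamine_tract(dna: str) -> str:
    """Translate the longest CAG run into a string of glutamine (Q) residues."""
    return "Q" * longest_cag_run(dna)


def in_disease_range(dna: str, threshold: int = HD_CAG_THRESHOLD) -> bool:
    """True if the longest CAG run meets or exceeds the threshold."""
    return longest_cag_run(dna) >= threshold


if __name__ == "__main__":
    # Placeholder flanks (ATG ... CCG) are made up for illustration only.
    normal_like = "ATG" + "CAG" * 20 + "CCG"
    expanded = "ATG" + "CAG" * 94 + "CCG"
    for name, seq in (("20 repeats", normal_like), ("94 repeats", expanded)):
        print(name, longest_cag_run(seq), in_disease_range(seq))
```

A sequence with 94 repeats, like the transgene used in the mouse model described below, falls well above the threshold in this sketch.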

To create a mouse model for Huntington disease, the tet-off system was 
used with a variant of the HD gene that consists only of exon 1 with 94 

FIGURE 21.20 A transgenic mouse model of Huntington disease carrying a mutant 
form of the HD gene encoding the huntingtin protein expressed under the control 
of the tet-off system. CAG94 indicates a sequence of 94 CAG repeats in exon 1 of the 
mutant HD transgene that encode polyglutamine. pFB, forebrain-specific promoter; 
tTA, tetracycline transactivator; tetO, tetracycline operator; p, promoter. 


CAG repeats as the transgene (Fig. 21.20). The tTA gene was placed under 
the control of a promoter that is active in the cells of the forebrain. Loss of 
embryos was avoided during pregnancy by adding doxycycline to the 
drinking water, which turned off the expression of the mutant HD gene. At 
birth, doxycycline was not supplied to the transgenic mice, which allowed 
continuous expression of the mutant HD gene and the production of a 
protein with a long polyglutamine sequence. A neurological condition that 
was similar to Huntington disease in humans developed over time in these 
mice. Interestingly, the features of the disease disappeared when the 
expression of the mutant HD gene was prevented by the addition of 
doxycycline. Thus, at least in this model, continuous expression of a mutant HD 
gene is required for establishment of the disease, and brain cells can recover 
when this synthesis ceases. 

While mouse models have contributed greatly to our understanding of 
human diseases, the short life span of the mouse may reduce its utility for 
the study of slow and progressive diseases, such as Huntington disease, 
that require observation over a longer time. Transgenic primate models, 
such as the rhesus macaque, may represent more accurate models for the 
study of human neurodegenerative diseases. Transgenic macaques were 
produced that expressed the mutant HD gene encoding the huntingtin 
protein with an expanded polyglutamine sequence. Rhesus oocytes were 
microinjected with a lentivirus vector containing the HD gene with 84 CAG 
repeats under the control of the human polyubiquitin C promoter. From 30 
transplanted embryos, five live monkeys were delivered, and two of these 
died within 1 day of birth. Although the contribution of the HD gene to 
their deaths cannot be established, postmortem analysis confirmed the 
presence of multiple copies of the HD gene and expression of huntingtin 
with a polyglutamine sequence. One of the surviving monkeys carries a 
single copy of the HD transgene and at 6 months old showed features of 
Huntington disease that are found in humans with the disease, including 
involuntary, jerky body movements (chorea) and muscle contractions 
(dystonia). The macaques will be used to establish a transgenic line to study the 
pathology of Huntington disease and to assess treatment and diagnostic 
strategies. Therapeutic strategies for Huntington disease will probably be 
directed toward either blocking the expression of the mutant HD gene in 
those people who test positive for more than 38 CAG repeats or clearing 
the abnormal protein from the neurons. In either case, the treatment must 
precisely target the mutant gene or protein, leaving the normal 
counterparts unaffected. 

Conditional Control of Cell Death 

The ability to induce cell death at different times and under defined 
conditions in a specific organ of a living organism is a helpful way to study organ 
failure caused by cell destruction and to determine how tissues and organs 
recover from various degrees of cell loss. Transgenic mice have been 
engineered for this purpose. For example, to examine the effects of liver cell 
damage, transgenic mice were created to express a receptor that is required 
for a bacterial toxin to cause cell death. The diphtheria toxin produced by 
the bacterial pathogen Corynebacterium diphtheriae binds to the human 
heparin-binding epidermal growth factor receptor, and the toxin-receptor 
complex is taken up into the cell, where it inactivates elongation factor 2 
(EF-2). Protein synthesis requires functional EF-2 molecules, and cell death 
ensues in the absence of protein synthesis (Fig. 21.21). Mouse cells are not 
normally susceptible to diphtheria toxin because they do not have a 
receptor that recognizes the bacterial protein; therefore, transgenic mice 
were engineered to express the human heparin-binding epidermal growth 
factor receptor under the control of a liver-specific promoter. The 
transgenic mice were treated with high doses of diphtheria toxin, and severe 


FIGURE 21.21 Genetically engineered cell death. (A) Cell membrane-localized human 
heparin-binding epidermal growth factor receptor (HB-EGFr) (brown rectangles) is 
synthesized in liver cells from an HB-EGFr cDNA transgene under the control of a 
liver cell-specific promoter (pLiver). (B) Diphtheria toxin (yellow and blue ovals) 
binds to HB-EGFr and is taken into the cell. A diphtheria toxin subunit (yellow 
oval) is released from an endosome and inactivates EF-2. Cell death follows the 
cessation of protein synthesis. 

liver damage occurred; at lower concentrations of the toxin, the extent of 
liver damage was proportional to the dose of the toxin. There was no 
uptake of the diphtheria toxin in other mouse tissues. Also, the presence of 
human heparin-binding epidermal growth factor receptor in the cell 
membrane of the mouse liver cells had no obvious effects on liver cell functions 
or other processes in the absence of the diphtheria toxin. These mice are a 
convenient model for examining the consequences of moderate to severe 
liver damage due to cell loss. In addition, the effects of selective removal 
(ablation) of various cell types can be studied in mice by combining the 
human heparin-binding epidermal growth factor receptor coding sequence 
with different cell-specific promoters. 


Cloning Livestock by Nuclear Transfer 

In a highly publicized case, a sheep named Dolly was cloned by transfer of 
a nucleus from a mammary (udder) cell of an adult sheep into an egg cell. 
This was the first demonstration of pluripotency (totipotency) of a nucleus 
of a differentiated adult cell. Since the cloning of Dolly, somatic cell nuclei 
have been used to clone cattle, goats, sheep, and pigs. In these cases, the 
nuclear transfer procedures are similar (Fig. 21.22). Briefly, embryonic, 
fetal, or adult donor cells from a variety of cell types (e.g., mammary 
epithelial and ovarian cells, fibroblasts, and lymphocytes) are isolated, 
cultured, and genetically modified using methods described above. Although 
not always feasible with adult cells, prolonged culture is preferred, because 
experimenters have additional time to carry out successive genetic 
alterations, such as inactivating both alleles of a locus or creating multiple gene 
changes. After a cell line with a specific genetic modification(s) is 
established, individual donor cells are fused to an enucleated oocyte with 
short-duration electric pulses. For example, two 2.5-kilovolt/cm pulses for 10 
microseconds each are used to fuse adult cattle fibroblasts with enucleated 
oocytes. The pulses simultaneously induce cell fusion and oocyte 
activation. Each fused cell is cultured to the blastocyst stage before being 
transferred into the uterus of a pseudopregnant female. At birth, genotype 
analysis is used to confirm the presence of the transgene. 

Generally, the surviving animals produced by nuclear transfer are 
healthy. However, there is substantial loss of individuals before and after 
birth, and some of the cloned animals display abnormalities. Abnormalities, 
such as increased birth weight, are more prevalent in some livestock 
animals, for example, in cloned calves and lambs, than in others. One reason 
that has been postulated to explain this poor survival is the failure of the 
donor genome to undergo epigenetic reprogramming, that is, the pattern of 
DNA methylation and histone modification of the original donor cell is 
inappropriately maintained in the cells of the recipient animal. Because the 
epigenetic state of DNA controls gene expression, this could seriously 
impair cellular function. Despite the low efficiency, nuclear transfer has a 
number of advantages over pronuclear DNA microinjection. With nuclear 
transfer, site-specific genetic changes are possible; all offspring are 
transgenic, and small herds of the same sex can be produced within a short time. 
In contrast, with DNA microinjection, transgene integration occurs at 
random sites, expression is often constrained because of the chromosome 
location, unstable tandem arrays are formed, and the establishment of a 
transgenic line requires a number of generations. For these reasons, much 

FIGURE 21.22 Cloning sheep by nuclear transfer. The nucleus of an ovum is removed 
(dashed arrow) with a pipette. Cells from the mammary epithelium of an adult are 
grown in culture, and the G0 (quiescent, nondividing) state is induced by inhibiting 
cell growth. A G0 cell and an enucleated ovum are fused, and the renucleated ovum 
is grown in culture or in ligated oviducts until an early embryonic stage before it is 
implanted into a foster mother, where development proceeds to term. In the 
experiment described by Wilmut et al. (1997), 277 enucleated ova were fused with G0 
mammary cells, and 1 of 29 transferred early-stage embryos produced a live lamb. 

effort is being devoted to perfecting the cloning of livestock with nuclei 
from somatic cells. 


Transgenic Livestock 

There are several reasons for producing transgenic livestock. First, although 
mice have proven to be useful in biomedical research as models of human 
diseases and for testing of disease treatments, the physiology, anatomy, and 
life span of a mouse are different from those of humans. Thus, livestock are 
often better animals in which to model disease processes, gene regulation, 
and immune system development. Second, many livestock animals 
produce large amounts of milk and therefore can be used to produce and 
secrete large amounts of recombinant proteins and other molecules of 
pharmaceutical importance. Third, genetic engineering can be used to 
rapidly and specifically improve livestock traits, such as growth rate, disease 
resistance, and milk quality. 

Conceptually, the methods used to generate transgenic cattle are 
similar to those used for transgenic mice. The essential steps in a modified 
mouse transgenesis DNA microinjection protocol (Fig. 21.23) entail (1) 
collecting oocytes, for example, from slaughterhouse-killed animals; (2) in 
vitro maturation of oocytes; (3) in vitro fertilization with bull semen; (4) 
centrifugation of the fertilized eggs to concentrate the yolk, which in 
normal eggs prevents the male pronuclei from being readily seen under a 
dissecting microscope; (5) microinjection of input DNA into male 
pronuclei; (6) in vitro development of embryos; (7) nonsurgical implantation of 
one embryo into one recipient foster mother in natural estrus; and (8) DNA 
screening of the offspring for the presence of the transgene. 

When this nonsurgical procedure was put to the test for cattle, two 
transgenic calves were obtained from an initial pool of 2,470 oocytes. This 
result indicates that the methodology is feasible although inefficient. The 
poor yield of transgenic calves after DNA microinjection is likely due to the 
low probability of integration. In addition, time and effort are expended in 
rearing nontransgenic embryos. To spare this time and effort, small 
numbers of cells can be taken from a developing embryo prior to implantation 
and assayed for the transgene using PCR. The loss of these cells does not 
interfere with normal development. The test will ensure that only embryos 
carrying the transgene are implanted. 
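
The inefficiency reported above can be put in numbers with a short calculation. The Python sketch below uses only the two figures quoted in the text (2 transgenic calves from 2,470 oocytes); the function name microinjection_efficiency is a hypothetical label chosen for this illustration.

```python
def microinjection_efficiency(transgenic_offspring: int, oocytes: int) -> float:
    """Fraction of starting oocytes that yielded a transgenic animal."""
    if oocytes <= 0:
        raise ValueError("oocytes must be a positive number")
    return transgenic_offspring / oocytes


if __name__ == "__main__":
    efficiency = microinjection_efficiency(2, 2470)
    oocytes_per_calf = 2470 / 2
    print(f"Overall efficiency: {efficiency:.3%}")
    print(f"Oocytes needed per transgenic calf: {oocytes_per_calf:.0f}")
```

On these figures, roughly 1,235 oocytes were processed for each transgenic calf, which is why screening embryos by PCR before implantation is attractive.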

Production of Pharmaceuticals 

Much of the research with transgenic livestock has been devoted to 
developing the mammary glands of these animals as bioreactors for the 
production of pharmaceutical proteins. This is likely due to the greater economic 
incentives, public acceptability, and ethical justification associated with 
using transgenic animals for the production of pharmaceuticals than with 
production of animals and animal products for human consumption. 
Recombinant proteins have been produced in the milk of a variety of 
transgenic mammals. 

Transgenic mice are initially used to test whether a specified protein 
can be secreted into milk. For example, large quantities of the authentic 
cystic fibrosis transmembrane regulator (CFTR) protein are needed to 

FIGURE 21.23 Steps in the development of transgenic cattle. 

study its function and to formulate potential therapies for treating cystic 
fibrosis, a prevalent genetic disease. The primary effect of a faulty CFTR 
gene is an alteration of the protein that normally acts as a chloride channel. 
As a consequence of the disruption of the proper flow of chloride ions into 
and out of cells, mucus accumulates in the ducts of several organs, 
especially the lungs and pancreas. This mucus prevents normal organ function 
and becomes the site of a bacterial infection that is difficult to control with 
antibiotics. 

The yields of CFTR with conventional in vitro cell expression systems 
have been low, possibly because of the biological consequences of the 
accumulation of CFTR in the cell membranes of transfected cells. The 
detrimental buildup of CFTR in the cell membranes of host cells could be 
avoided if the cell membranes were shed frequently. With such a system, 
not only would a heterologous transmembrane protein be associated with 
the released fragments of plasma membrane, but also, concentrating and 
purifying the recombinant protein would be relatively straightforward. In 
fact, during lactation, fat from within the mammary gland cell is 
encapsulated by plasma membrane, and together they are secreted into milk as a 
globule. 

To test the feasibility of this concept, a full-length CFTR cDNA sequence 
was cloned into the middle of a defective goat β-casein gene that had a 
deletion extending from the end of exon 2 to the beginning of exon 7 (Fig. 
21.24). The construct retained the promoter and termination sequences of 
the goat β-casein gene. The CFTR cDNA was cloned into a structural gene 
to provide introns for enhancing transcription of the transgene. The 
β-casein gene is actively expressed in mammary glands during lactation, 
and β-casein is a major milk protein. Transgenic mouse lines carrying the 
CFTR sequence under the control of the β-casein gene regulatory sequences 
were established. As predicted, the milk of transgenic females contained 
the CFTR protein bound to the membranes of fat globules. There were no 
negative effects on either CFTR-transgenic lactating mothers or pups that 
were fed milk that contained CFTR. The CFTR protein was glycosylated 
and readily extracted from the fat-rich fraction of the milk. Many other 
proteins that are potentially therapeutic for humans have also been 
synthesized by the mammary gland cells of lactating transgenic mice; however, to 
obtain large quantities of CFTR, other medically important transmembrane 
proteins, and various human therapeutic proteins, the transgenic 
constructs must be incorporated into the genome of a larger mammal, such as 
a cow, sheep, or goat. 

With a method very similar to the one used for producing transgenic 
mice, and with transgene constructs that have mammary gland-specific 

FIGURE 21.24 Goat β-casein gene-CFTR cDNA expression construct. The full-length 
cDNA for CFTR was cloned between exon 2 (EX2) and exon 7 (EX7) of the goat 
β-casein gene. The promoter (p) and transcription termination (t) sequences and 
exons 1, 8, and 9 (EX1, EX8, and EX9) of the β-casein gene were retained. 

promoters driving human gene sequences, investigators have created 
transgenic sheep, goats, pigs, and rabbits for more than 100 different human 
proteins that are secreted into milk (Table 21.2). The transgene-derived 
proteins were glycosylated and had other posttranslational modifications. In 
many of these cases, the recombinant proteins had biological activities 
identical to those of proteins from human sources. If the mammary gland is to 
be used as a bioreactor, dairy cattle, which each annually produce 
approximately 10,000 liters of milk containing about 35 grams of protein per liter, 
are likely candidates for transgenesis. More specifically, if a recombinant 
protein were present at 1 gram per liter of milk and it could be purified with 
50% efficiency, the yield from 20 transgenic cows would be about 100 kg per 
year. Coincidentally, the annual global requirement for protein C, which is 
used for the prevention of blood clots, is about 100 kg. On the other hand, 
one transgenic cow would be more than sufficient for the production of the 
annual world supply of factor IX (plasma thromboplastin component), 
which is used by hemophiliacs to facilitate blood clotting. 
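
The yield estimate above is simple arithmetic, and it can be reproduced with a few lines of Python. The sketch below uses only the figures given in the text (10,000 liters per cow per year, 1 gram of recombinant protein per liter, 50% purification efficiency); the function names are hypothetical and chosen for this illustration.

```python
import math


def annual_yield_kg(animals: int, liters_per_animal: float,
                    grams_per_liter: float, purification_efficiency: float) -> float:
    """Purified recombinant protein per year, in kilograms."""
    grams = animals * liters_per_animal * grams_per_liter * purification_efficiency
    return grams / 1000.0


def animals_needed(demand_kg: float, liters_per_animal: float,
                   grams_per_liter: float, purification_efficiency: float) -> int:
    """Smallest whole number of animals that meets an annual demand."""
    per_animal_kg = annual_yield_kg(1, liters_per_animal,
                                    grams_per_liter, purification_efficiency)
    return math.ceil(demand_kg / per_animal_kg)


if __name__ == "__main__":
    # Figures from the text: 20 cows, 10,000 liters each, 1 g/liter, 50% recovery.
    print(annual_yield_kg(20, 10_000, 1.0, 0.5))    # 100.0 kg per year
    # Herd size needed for the 100-kg annual requirement for protein C.
    print(animals_needed(100, 10_000, 1.0, 0.5))    # 20 cows
```

Under these assumptions each cow contributes 5 kg of purified protein per year, so the required herd size scales directly with the annual demand.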

Although the quantity of milk produced by either a sheep or a goat is 
smaller than that produced by a cow, lactation in sheep and goats yields 
hundreds of liters of milk per year (Table 21.3). Goats can also be raised to 
produce milk more rapidly than cows. Recently the U.S. Food and Drug 
Administration approved the human protein antithrombin produced in 
goat's milk for use in individuals who have a hereditary deficiency in the 
production of this protein and are undergoing surgery or giving birth. 
Antithrombin is a protease inhibitor that acts as an anticlotting factor by 
inhibiting the activity of thrombin and other coagulation proteases and 
thereby prevents the excessive formation of blood clots and promotes the 
clearing of clotting factors. It also has anti-inflammatory activity. 
Approximately 1 in 5,000 people is unable to produce this protein naturally, 
which puts them at risk for heart attacks and strokes. While antithrombin 
can be extracted from the plasma of donated blood, the supply is not 
sufficient to meet the needs of patients. Blood extraction is less efficient and 
more costly and has a higher risk of contamination with human pathogens 
than milk extraction, and the milk of transgenic goats is a significant source 
of human antithrombin, with yields of 2 to 10 grams per liter of milk. In cell 
cultures, yields on the order of 0.2 to 1 gram per liter of culture medium 
have been attained. It has been estimated that 75 transgenic goats are 
required to meet the annual worldwide demand for antithrombin. It is 
likely that several other human therapeutic proteins produced in 
transgenic goats will be available soon, including other blood proteinase 
inhibitors, such as antitrypsin; human clotting factors, such as factor IX for the 
treatment of hemophilia; and monoclonal antibodies. 

Production of Donor Organs 

Animals are a potential source of organs for transplantation into humans. 
Human-to-human organ transplants (allotransplantations, or allografts) of 
hearts, livers, and kidneys are 75 to 95% effective for the first year, and on 
average, transplant patients survive for 10 to 15 years. However, throughout 
the world, the demand for donated organs far exceeds the available supply. 
In the United States, for example, more than 80,000 kidney transplants 
were required in 2008, but only 15,000 were performed. With this in mind, 
animal-to-human transplants (xenotransplantations, or xenografts) have 
been proposed as a way to alleviate this disparity. In this context, swine 


TABLE 21.2 Some human proteins that have been expressed in the mammary 
glands of transgenic animals 

Antithrombin III 
α1-Antitrypsin 
Calcitonin 
Erythropoietin 
Factor IX 
Factor VIII 
Fibrinogen 
Glucagon-like peptide 
α-Glucosidase 
Granulocyte colony-stimulating factor 
Growth hormone 
Hemoglobin 
Serum albumin 
Insulin 
Insulin-like growth factor 1 
Interleukin 2 
α-Lactalbumin 
Lactoferrin 
Lysozyme 
Monoclonal antibodies 
Nerve growth factor 
Protein C 
Superoxide dismutase 
Tissue plasminogen activator 

TABLE 21.3 Milk production and estimated recombinant protein yields 
from organisms used for the expression of transgenes in mammary 
glands 

Organism    Annual milk yield (liters)    Estimated recombinant protein per female (kg/year) 
Rabbit      5                             0.02 
Pig         300                           1.5 
Sheep       500                           2.5 
Goat        900                           4 
Cow         10,000                        60 

have been considered the most likely source of organs for 
xenotransplantation because their organs are similar in size and physiological functions to 
those of humans, and because they are raised for food, it might be socially 
acceptable to use them as organ donors. 

A major impediment to organ transplantation between species is 
hyperacute rejection of the animal organ. Hyperacute rejection entails the 
binding of preexisting antibodies of the host organism to a carbohydrate 
epitope (α-Gal) on the surfaces of the cells of the grafted organ. The bound 
antibodies elicit an inflammatory response (complement cascade) that 
destroys the antibody-coated cells and leads to the loss of the transplanted 
organ within hours. Under natural conditions, proteins on the surfaces of 
the cells lining the blood vessels protect the cells from the inflammatory 
response. These complement-inhibiting proteins are species specific. 
Therefore, it was reasoned that if the donor animal carried one or more of 
the genes for a human complement-inhibiting protein, a transplanted 
organ would be protected from the initial inflammatory response. 

Transgenic pigs with different human complement inhibitor genes 
have been produced. Hyperacute rejection did not occur after kidneys from 
transgenic pigs were transplanted into a primate host, and survival times 
were 20 to 90 days, depending on the human complement inhibitor 
expressed. The survival times also depended on the levels of 
immunosuppressive drugs that were administered. Another strategy that shows some 
promise in pig-to-primate transplantation trials is the production of 
transgenic pigs with organs that do not produce the antigenic α-Gal epitope by 
deleting the gene encoding 1,3-α-galactosyltransferase. 

The possibility that latent pig pathogens, such as porcine endogenous 
retrovirus (PERV), might become activated after xenotransplantation and 
cause infections in humans must also be addressed. Preliminary 
information indicates that PERV replicates in some established human cell lines, but 
not in newly established cell cultures or in human cells in vivo. Furthermore, 
there is no evidence that PERV produces adverse symptoms in humans. 
Certainly, all likely complications, as well as all of the ethical issues, must be 
resolved before xenografts are considered for clinical trials. 

Disease-Resistant Livestock 

Currently, infectious diseases of domestic animals are controlled by vacci¬ 
nation, drugs, physical isolation, and careful monitoring. The cost of dis¬ 
ease prevention can be as much as 20% of the total production value. The 
development of transgenic animals with inherited resistance to bacterial, 
viral, and parasitic diseases would decrease the use of drugs used to treat 
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these problems, increase productivity, provide safer foods derived from 
these animals, and increase economic benefits. Genetic resistance to
infectious diseases, such as mastitis (mammary gland abscesses) in dairy cattle,
bovine spongiform encephalopathy (BSE) (also called mad cow disease) in
cattle, neonatal scours (dysentery) in swine, and fowl cholera, would be
a likely target. If resistance in each case is genetically determined, it may be
possible to create transgenic animals with specific protection against a
disease after the genes responsible have been isolated and characterized.

One approach that may be used to develop lines of animals that are 
resistant to infectious agents entails creating inherited immunological
protection by transgenesis. The most favorable preliminary results to date
have come from research in which the genes encoding the heavy and light 
chains of a monoclonal antibody have been transferred to mice, rabbits, 
goats, and pigs. The rationale behind this strategy is to provide a built-in, 
inherited biological protection mechanism for the transgenic animal that 
eliminates the need for immunization by vaccination. 

The concept of introducing the transgenes for an antibody that binds to 
a specific antigen into a recipient animal is called in vivo immunization. 
Although the animal usually has an intact humoral immune system, 
expression of a monoclonal antibody against a specific pathogen would 
provide immediate protection without prior exposure to the pathogen. If
the transgene were engineered so that the monoclonal antibody is
secreted into milk, young suckling animals would acquire passive
immunity against the pathogen.

Another strategy to protect livestock from infectious disease is to use
genetic engineering to eliminate production of the host cell component with
which the infectious agent interacts. This strategy has been proposed for
the prevention of prion diseases. Prion diseases are caused by aberrant 
forms of normal brain proteins. They are infectious in the sense that, once 
they are acquired by a cell, the aberrant proteins induce the normal versions 
of the brain proteins to misfold (Fig. 21.25). The misfolded proteins
aggregate and disrupt normal brain function. One prion disease that is
particularly problematic for the beef industry is BSE, caused by an aberrant form of
the protein known as prion protein BSE (PrP BSE) to distinguish it from the
normal protein, PrP c. Scrapie is a similar prion disease found in sheep. BSE,
often referred to as mad cow disease for the neuropathological symptoms in 
cows, has caused huge economic losses for cattle and dairy farmers and is 
the motivation for contentious barriers to the trade of livestock between 
countries. There is no known treatment for the disease, and therefore, many 


FIGURE 21.25 Prion proteins that elicit BSE are aberrant forms (PrP BSE) of a normal
brain protein (PrP c). Infection with PrP BSE induces the normal versions of the brain
proteins to misfold and aggregate, which disrupts normal brain function. Misfolded
proteins can be transmitted to other animals and induce normal PrP c proteins in
those animals to misfold.

animals have to be destroyed, usually by incineration. Moreover, there is
some evidence that prions can be transmitted to humans through
consumption of prion-contaminated meat products and cause a variant form of
Creutzfeldt-Jakob disease, a human spongiform encephalopathy (in humans, the
prion is referred to as PrP CJD). This insidious disease can be asymptomatic for very
long periods but can eventually manifest as mental deterioration. Thus, it 
would be a relief for farmers, and indeed consumers of animal products, if 
livestock could be protected from prion protein infection. 

The potential of engineering resistance by abolishing production of the 
normal version of the protein PrP c was tested first in mice and then in 
cattle, in which both alleles of the gene encoding PrP c were disrupted by 
the insertion of an antibiotic resistance gene into the coding sequences (Fig.
21.26). The genetically modified animals were assessed for a variety of
morphological and physiological features, in particular, for the presence of 
features that are used to diagnose spongiform encephalopathy, including 
mental status, sensory and motor functions, immune function, and brain 
tissue morphology. The brain tissue of infected animals becomes filled with 
holes that give the brain a characteristic sponge-like appearance. In all 
aspects, the transgenic cows were found to be normal and have remained 
normal for almost 2 years. Brain tissue homogenates were collected from 
wild-type and PrP c knockout cattle and incubated with brain homogenates 
of BSE-infected cattle carrying the abnormal version of the prion protein, 
PrP BSE (Fig. 21.26). Propagation of PrP BSE could not be detected in the
homogenates of PrP c knockout animals, whereas it was readily detected in the
wild-type homogenates. These results indicate that loss of function of PrP c does
not cause BSE and that the normal PrP c is required for propagation of the
aberrant form of the prion protein; they also suggest that the genetically
engineered PrP c knockout cattle could be resistant to BSE infection. Tests are
now under way to determine if the PrP c knockout cattle are resistant to 
challenge with PrP BSE in vivo. 

The bacterium Staphylococcus aureus is responsible for 25% of the cases 
of mammary gland infection (mastitis) in cows. These infections are
contagious, recur frequently following termination of antibiotic treatment, and
readily spread through an entire herd. Milk yields from infected cows are 


FIGURE 21.26 Normal brain protein, PrP c, found in the brains of wild-type cattle, is
required for propagation of the aberrant form of the protein, known as prion
protein BSE (PrP BSE), that causes BSE.

significantly lowered. Currently, outbreaks of mastitis caused by S. aureus
cannot be effectively controlled, with an annual cost of about $2 billion in 
the United States. Transgenic cows that secrete a staphylolytic agent into 
their milk to prevent the infections have been produced. Staphylococcus 
simulans produces lysostaphin, a peptidoglycan hydrolase that specifically 
attacks the cell wall of S. aureus. However, when cultured eukaryotic cells 
were initially transfected with the native lysostaphin gene, only inactive 
lysostaphin was produced because two asparagine residues were
glycosylated. This problem was overcome by using in vitro mutagenesis to replace
the codons for these two asparagine residues with those for glutamine (Fig. 
21.27). The modified lysostaphin was nonglycosylated after synthesis in 
eukaryotic cells and was fully active against S. aureus. 
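
The codon replacement described above can be illustrated with a minimal Python sketch. The sequence used here is an invented six-codon toy sequence, not the lysostaphin gene, and the function names (replace_codon, asn_to_gln) are hypothetical. Only the principle comes from the text: an asparagine codon (AAT or AAC in the standard genetic code) is exchanged for a glutamine codon (CAA or CAG) at a chosen amino acid position, leaving the rest of the reading frame untouched.

```python
# Toy illustration of site-directed codon replacement (Asn -> Gln).
# The sequence is invented for illustration; it is NOT the lysostaphin gene.

ASN_CODONS = {"AAT", "AAC"}   # asparagine (N) codons, standard genetic code
GLN_CODON = "CAG"             # one of the two glutamine (Q) codons (CAA, CAG)


def replace_codon(cds: str, aa_position: int, new_codon: str) -> str:
    """Return cds with the codon at the 1-based amino acid position replaced."""
    start = (aa_position - 1) * 3
    if start + 3 > len(cds):
        raise ValueError("position lies outside the coding sequence")
    return cds[:start] + new_codon + cds[start + 3:]


def asn_to_gln(cds: str, aa_position: int) -> str:
    """Replace an asparagine codon with a glutamine codon, checking the target."""
    start = (aa_position - 1) * 3
    codon = cds[start:start + 3]
    if codon not in ASN_CODONS:
        raise ValueError(f"codon {codon!r} at position {aa_position} is not Asn")
    return replace_codon(cds, aa_position, GLN_CODON)


# Hypothetical toy sequence: Met-Asn-Gly-Ser-Asn-Thr
toy_cds = "ATGAACGGTTCTAATACT"
mutant = asn_to_gln(asn_to_gln(toy_cds, 2), 5)
print(mutant)  # ATGCAGGGTTCTCAGACT (Met-Gln-Gly-Ser-Gln-Thr)
```

Checking that the targeted codon really encodes asparagine before replacing it guards against an off-by-one error in the position, which would otherwise silently mutate the wrong residue.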

The efficacy of this approach was first tested by engineering mice to 
express the altered lysostaphin gene under the control of the promoter of 
sheep β-lactoglobulin, which is secreted into milk. Based on the successful
protection of the transgenic mice against large inocula of S. aureus, the 
approach has been extended to produce cows that express the lysostaphin 
transgene. The altered lysostaphin gene under the control of the ovine
β-lactoglobulin promoter was introduced into cow fibroblasts, and the
nuclei from these cells were then transferred to enucleated oocytes and
activated. Blastocysts were implanted into the uteri of cows, and several calves
were subsequently born and used to establish transgenic lines. After nine
infusions of S. aureus into the mammary glands of high-lysostaphin-expressing
cows, no infections were observed. In contrast, 71% of the infused
mammary glands of nontransgenic animals were infected. Moreover, even 
low levels of lysostaphin expression afforded a significant level of protection 
against the pathogen. While these results show promise for solving an 
important problem for the beef and dairy industries, the food safety issues 
associated with milk containing lysostaphin have yet to be addressed. 

Improving Milk Quality 

One of the goals of transgenesis of dairy cattle is to improve the nutritional 
value of milk for humans and for suckling animals. The major nutrients in 
milk are proteins, fat, and the carbohydrate lactose. Overexpression of 
proteins involved in the production of milk nutrients, for example, can 


FIGURE 21.27 Native lysostaphin produced by S. simulans is a peptidoglycan
hydrolase that cleaves the cell wall of the animal pathogen S. aureus. In animal cells
transfected with the lysostaphin coding sequence from S. simulans, lysostaphin is
glycosylated (yellow stars) and inactive against S. aureus. Alteration of the
lysostaphin gene to replace the coding sequences for two asparagine residues at amino
acid positions 125 and 232 (N125 and N232) with those for glutamine residues (Q125 and Q232)
results in the production of nonglycosylated, active lysostaphin in animal cells.

improve the growth, health, and survival of suckling animals. Specific
components of milk can also be altered to benefit human consumers. The 
amount of cheese produced from milk is directly proportional to the
β-casein and κ-casein contents. An increase in β-casein content reduces the
time required for protein coagulation and whey removal, with the
desirable result of firmer curds. Increasing the production of these proteins in
milk was achieved in cows engineered with additional copies of the
β-casein and κ-casein genes. Other milk nutrients were largely unaffected
by the presence of the casein transgenes: the vitamin, mineral, amino acid, 
fatty acid, and antibody contents were similar to those in nontransgenic 
milk. The cheese manufactured from the transgenic milk had a higher 
concentration of some amino acids, which increased its nutritional value, 
and a lower fat content. 

Modification of the lactose content of milk would be welcomed by the 
many people who are lactose intolerant due to a deficiency in the
production of the lactose-hydrolyzing enzyme lactase. Lactose-intolerant
individuals experience severe indigestion after the consumption of milk or
milk-containing foods. Expression of the mammalian lactase transgene in 
the mammary gland could decrease the lactose content of milk; however, 
the presence of some lactose is required for milk secretion. Although this 
has not yet been verified in cows, proof of principle was demonstrated in 
transgenic mice with a 50 to 85% reduction in milk lactose content. 
Similarly, many people who are allergic to bovine milk would benefit from
the abolition of β-lactoglobulin, a major allergen in milk. Again, the
feasibility of this approach has not yet been demonstrated in cows, mainly
because the creation of knockout mutants by inserting a selectable marker
into the protein-coding sequence is much less efficient in livestock
animals than it is in mice. This is due to the low frequency of homologous
recombination in the cells of these animals. 

Improving Animal Production Traits 

Improving production traits, such as muscle mass in meat animals, is often 
more difficult than engineering animals to express a foreign gene in milk 
because multiple genes are involved in controlling growth and body
composition and a detailed understanding of the genetic basis for the traits of
interest is required. Initially, researchers sought to increase the body mass
of livestock by introducing genes encoding growth hormones or
insulin-like growth factor. Although these early efforts yielded animals with
increased ability to convert feed into body weight, they were hampered by 
difficulties in controlling the expression of the transgenes. Overproduction 
of growth hormone was found to adversely affect the animals' health, 
which manifested as gastric ulceration, kidney dysfunction, lameness, 
inflammation of the lining of the heart, immobility of the joints, and
susceptibility to pneumonia; the reasons for these symptoms are not known.
To overcome this problem, inducible promoters were used, such as the
metallothionein promoter, which can be activated by zinc administered in
the animals' diet; however, expression was often poor because the
transgene was incorporated into heterochromatin, i.e., transcriptionally silent
regions of the genome.

Some cattle breeds, such as Belgian Blue, have larger and leaner muscle 
mass than other breeds because of naturally occurring mutations in the 
gene encoding myostatin, a growth factor that normally inhibits the growth
of skeletal muscle. Interestingly, humans with rare mutations that disrupt
the function of myostatin are unusually muscular, and dogs of the whippet 
breed that have enhanced muscle mass, known as "double muscling," and 
increased racing speed carry mutations in both copies of the myostatin 
gene (MSTN). These mutations result in production of a truncated, and 
therefore nonfunctional, myostatin protein. Some researchers have
reasoned that a transgenesis approach could be used to disrupt myostatin
function in other cattle breeds to increase meat production. Moreover, if 
disruption could be limited to male animals, then milk production in 
females would be unaffected. The feasibility of this approach has been 
demonstrated in mice, where a myostatin inactivator was targeted to the Y 
chromosome so that it would be expressed only in male animals. The
myostatin inactivator consists of the N-terminal propeptide domain
(latency-associated peptide [LAP]) of the myostatin protein. Following proteolytic
cleavage, the N-terminal propeptide can hold the C-terminal portion of the 
protein, which is the biologically active component, in an inactive state 
(Fig. 21.28). This blocks myostatin activity that would normally lead to 
inhibition of muscle growth. A two-step strategy was used to target the 
gene encoding the myostatin inactivator to the mouse Y chromosome. In 
the first step, a cassette containing positive (Neo r) and negative (HSV-tk)
selectable markers flanked by loxP sites was inserted into a nonessential 
region of the Y chromosome of mouse embryonic stem cells by homologous 
recombination (Fig. 21.29A). Transformed cells were selected by resistance 
to the antibiotic G-418 conferred by the neo gene, and the integration site in 
the Y chromosome was confirmed by PCR. In the second step, a gene 
encoding the myostatin inactivator, under the control of a strong rat
skeletal muscle promoter and enhancer and also flanked by loxP sites, was
cloned into a plasmid and introduced by electroporation into the
transfected embryonic stem cells carrying the selectable marker cassette (Fig.
21.29B). A second plasmid encoding Cre recombinase was also introduced

FIGURE 21.28 Myostatin consists of an N-terminal propeptide domain (LAP) and an
active C-terminal domain. Proteolytic cleavage and folding yield an active
C-terminal dimer. Following proteolytic cleavage, the C-terminal dimer can form
an inactive (latent) complex with the propeptide domain.

FIGURE 21.29 Strategy to generate transgenic mice with increased muscle mass. The
gene encoding the myostatin inactivator, which consists of the N-terminal
propeptide domain (LAP) of myostatin, was introduced into the Y chromosome of mouse
embryonic stem cells using a two-step procedure. (A) In the first step, a cassette
carrying genes encoding selectable markers (neo and tk) flanked by loxP sites was
introduced into the Y chromosome by recombination between a nonessential
sequence in the Y chromosome (Y) and a homologous sequence cloned into the
vector. The positive selectable marker (neo) confers resistance to the antibiotic
G-418, and the negative selectable marker (tk) confers sensitivity to ganciclovir. (B)
In the second step, the selectable markers in the Y chromosome were replaced with
a cassette carrying the myostatin inactivator (lap), under the control of the rat
skeletal muscle promoter (p) and enhancer (e), by Cre recombinase-mediated exchange
at the loxP sites. Adapted from Pirottin et al., Proc. Natl. Acad. Sci. USA
102:6413-6418, 2005.

into the transfected cells. Cre recombinase activity resulted in
recombination between loxP sites and integration of the myostatin inactivator gene
into the Y chromosome (Fig. 21.29B). Exchange of the myostatin inactivator 
sequence for the selectable marker cassette was initially selected by
resistance to ganciclovir, indicating loss of the tk gene, and then confirmed by
PCR. Transfected embryonic stem cells carrying the myostatin inactivator
gene were cultured, inserted into blastocysts, and then implanted into 
pseudopregnant foster mothers. A single transgenic male founder mouse 
was obtained and then mated with nontransgenic females. As expected for 
a gene integrated into the Y chromosome, all male offspring carried the 
myostatin inactivator transgene whereas none of the female offspring were 
transgenic. In the transgenic males, the myostatin inactivator was expressed 
in skeletal muscle tissue and not in heart or liver tissue. Muscle mass was 
greater by 5 to 20% in transgenic males than in nontransgenic male control 
mice, suggesting that this may be a feasible approach to increase meat 
yields in livestock. Larger offspring often lead to birthing difficulties for 
female animals; therefore, it may be necessary to use promoters that can be 
controlled to delay expression of the transgene until after birth. 
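
The inheritance pattern reported for the Y-linked transgene follows directly from how sex chromosomes segregate, and it can be checked with a short Python enumeration. This is only a sketch: the label "Y*" for a Y chromosome carrying the transgene and the function names are hypothetical, and the model ignores everything except which sex chromosome each parent contributes.

```python
from itertools import product

# Hypothetical labels: "Y*" is a Y chromosome carrying the transgene.
FATHER = ("X", "Y*")   # transgenic founder male
MOTHER = ("X", "X")    # nontransgenic female


def offspring_classes(father, mother):
    """List every (paternal, maternal) sex chromosome combination."""
    return list(product(father, mother))


def is_male(genotype):
    # Sex is set by the paternal chromosome: any Y, with or without transgene.
    return genotype[0].startswith("Y")


def is_transgenic(genotype):
    return "Y*" in genotype


litter = offspring_classes(FATHER, MOTHER)
males = [g for g in litter if is_male(g)]
females = [g for g in litter if not is_male(g)]

print("males transgenic:  ", [is_transgenic(g) for g in males])
print("females transgenic:", [is_transgenic(g) for g in females])
```

Every male class inherits the father's only Y chromosome and therefore the transgene, whereas every female class receives his X chromosome and cannot carry it, which matches the observation that all male and no female offspring were transgenic.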

A diet rich in omega-3 fatty acids has been extolled as an aid in the 
prevention of cancer; autoimmune diseases, such as arthritis; and a variety 
of other diseases. Omega-3 fatty acids are long-chain polyunsaturated 
fatty acids found mainly in fish. Humans cannot produce these fatty acids, 
nor can livestock animals whose tissues are consumed in human diets; 
rather, they are acquired from a diet containing fish meal, fish oils, and 
flaxseed. The tissues of livestock animals contain high levels of omega-6 
fatty acids, largely because they are fed a grain diet rich in these fatty acids 
(Fig. 21.30). Livestock lack the enzymes to convert omega-6 fatty acids to 
omega-3 fatty acids. Diets with high omega-6 content relative to omega-3 
content contribute to a variety of diseases, including cancer, heart disease,


FIGURE 21.30 Some omega-3 and omega-6 fatty acids. Omega-3 fatty acids have long 
hydrocarbon chains with double bonds between several carbon atoms. In all 
omega-3 fatty acids, the first double bond is found at the third carbon from the 
methyl (-CH3) end. Omega-6 fatty acids are also long-chain polyunsaturated fatty
acids; however, the first double bond is found at the sixth carbon from the methyl
end. In parentheses, the first number refers to the number of carbon atoms in the
hydrocarbon chain, and the second number refers to the number of double bonds;
for example, α-linolenic acid (C18:3) has 18 carbons and three double bonds.

and diabetes. One strategy to increase the omega-3 fatty acid content in
the human diet is to produce pigs that synthesize omega-3 fatty acids. The 
roundworm Caenorhabditis elegans produces a desaturase that can convert 
omega-6 fatty acids to omega-3 fatty acids by introducing a double bond 
into the hydrocarbon chain. When the gene encoding this desaturase, fat-1,
was transferred to mice, they gained the ability to synthesize omega-3 fatty
acids. An optimized version of the fat-1 gene (modified codon usage) was
cloned into an expression vector under the control of the chicken β-actin
promoter and the cytomegalovirus enhancer and was used to transfect fetal
pig fibroblasts.
Cultured cells that produced higher levels of omega-3 fatty acids and 
lower levels of omega-6 fatty acids were used to produce fat-1 transgenic 
pigs by nuclear transfer. The transgenic pigs showed threefold-higher 
levels of omega-3 fatty acids and 23% lower levels of omega-6 fatty acids 
than their nontransgenic counterparts, indicating that they produced the 
Fat-1 desaturase and converted omega-6 to omega-3 fatty acids. 
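
The omega nomenclature and the effect of the desaturase can be made concrete with a small Python sketch. The double-bond positions used below (9 and 12 for linoleic acid; 9, 12, and 15 for α-linolenic acid, counted from the carboxyl carbon) are standard biochemistry values rather than figures taken from this chapter, and the function names are hypothetical. The desaturase function is a deliberately simplified model: it only adds a double bond three carbons from the methyl end.

```python
def omega_class(n_carbons, delta_positions):
    """Position of the first double bond counted from the methyl end.

    delta_positions are double-bond positions numbered from the carboxyl
    carbon (the usual "delta" convention), e.g. (9, 12, 15).
    """
    return n_carbons - max(delta_positions)


def n3_desaturate(n_carbons, delta_positions):
    """Simplified omega-3 desaturase: add a double bond three carbons from
    the methyl end if one is not already present."""
    new_bond = n_carbons - 3
    if new_bond in delta_positions:
        return tuple(delta_positions)
    return tuple(sorted(set(delta_positions) | {new_bond}))


linoleic = (18, (9, 12))               # C18:2, an omega-6 fatty acid
alpha_linolenic = (18, (9, 12, 15))    # C18:3, an omega-3 fatty acid

print(omega_class(*linoleic))          # 6
print(omega_class(*alpha_linolenic))   # 3
print(n3_desaturate(*linoleic))        # (9, 12, 15)
```

In this model, a single added double bond is enough to change the classification of an 18-carbon omega-6 fatty acid to omega-3, which is why one introduced enzyme can shift the fatty acid profile of the animal.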

Transgenesis is also being used to address certain environmental
concerns. For example, a major ecological problem with the mass rearing of
pigs and poultry, i.e., monogastric organisms, is the overabundance of 
phosphorus in their fecal material. Phosphorus from pig or poultry manure 
that is stored outdoors or used as fertilizer can run off into water systems 
and cause excessive growth of cyanobacterial and algal populations (algal 
blooms) that in turn deplete the oxygen supply and subsequently kill fish 
and other aquatic organisms. In addition, large amounts of phosphorus in 
the environment are implicated in the production of gases that enhance the 
greenhouse effect and contribute to global warming. 

Pigs and poultry excrete large amounts of phosphorus because, unlike 
ruminants, they are unable to digest and utilize phytate (myo-inositol
1,2,3,4,5,6-hexakisdihydrogen orthophosphate, or phytic acid) (Fig. 21.31A),
the predominant storage form of phosphorus in plant-based animal feeds. 
For instance, the main food source for pigs is soybean meal, which has 
about 50% or more of its phosphate as phytate. The inability to catabolize 
phytate is due to the absence of the enzyme phytase, which is found in 
plants and microorganisms. Most phytases remove successive phosphates 
from phytate to produce inositol 2-monophosphate (Fig. 21.31B) or inositol
(Fig. 21.31C). Phytase has been added to animal feed to facilitate the dietary
uptake of phosphorus and to lower the phosphorus content of the excreta. 
However, this supplement is costly and inefficient because much of the 
enzyme activity is lost during the preparation and storage of the feed. 

As an alternative strategy, it was reasoned that transgenic pigs 
expressing a phytase gene in their salivary glands could overcome the 
nutritional and pollution consequences of phytate in the diet. To this end, 
pronuclear embryo microinjection was used to introduce into pigs a
transgene construct consisting of the phytase gene appA from E. coli under the
control of the parotid secretory protein promoter that constitutively drives
the transcription of a salivary-specific protein in mice. Established
phytase-producing transgenic porcine lines were tested for growth and phosphorus
excretion using soybean meal with 53% of its total phosphorus as phytate. 
Under these conditions, the soybean phytate was almost totally digested 
and the fecal phosphorus content was reduced by 75% in comparison to
nontransgenic controls. No adverse effects were noted among the phytase
transgenic pigs. Tissues from the pigs that are utilized as meat for human 
consumption contain only trace amounts of the recombinant protein and 
have essentially the same composition as meat from nontransgenic pigs. 
"Enviropig," as it is aptly called, is currently under evaluation by the U.S. 
Food and Drug Administration for approval for commercialization. 


Transgenic Poultry 

Several features that are unique to avian reproduction and development 
make the production of transgenic strains by microinjection of DNA into 
fertilized eggs extremely inefficient. For example, during fertilization in 
birds, several sperm penetrate the ovum instead of one, as usually occurs 
in mammals. As a result, it is impossible to identify the male pronucleus 
that will fuse with the female pronucleus. Also, DNA injected into the
cytoplasm of the fertilized egg does not integrate into genomic DNA. Finally,
even if nuclear DNA microinjection were practicable, the technique would 
be difficult to implement because the avian ovum after fertilization 
becomes, in rapid succession, enveloped in a tough membrane, surrounded
by large quantities of albumen, and enclosed in inner and outer shell
membranes.

Despite these disadvantages, it is possible to inject a transgene into the 
region (germinal disc) on the yolk that contains the female and male
pronuclei. The germinal disc is present before the eggshell is formed. After the
administration of DNA to a germinal disc, each egg is cultured in vitro, and
when an embryo forms, it is placed in a surrogate egg to produce a
hatchling. Despite the technical difficulties, some transgenic lines of chickens
have been established by this method. 

By the time an avian egg outer shell membrane has hardened, the 
developing embryo (blastoderm stage) has two layers consisting of 30,000 
to 60,000 cells. In trial experiments, inoculation of the blastoderm stage 
with replication-defective retrovirus vectors containing bacterial marker 
genes resulted in a few chickens and quail carrying these DNA sequences 
in their germ lines. Although some of these transgenic organisms did not 
produce virus, the use of retrovirus vectors to deliver genes for a product 
that is to be used as food would ultimately raise questions about its safety, 
whether real or imagined. Moreover, the size of the transgene that can be 
introduced into the recipient organism by retrovirus vectors is limited to
~8 kb, and occasionally, integration at the initial site is not permanent.
Consequently, other methods of transgenesis have been examined. 

The preferred vehicle for transgenesis is pluripotent cells that can be 
maintained continuously in culture and genetically modified by standard 
methods. To date, only blastoderm cells and primordial germ cells, which 
have limited growth in vitro, and stage X (the stage of the embryo in a 
newly laid egg) chicken embryonic stem cells, which survive for 21 days in 
culture, are available for this purpose. Briefly, primordial germ, stage X 
embryonic stem, or blastoderm cells (Fig. 21.32) are removed from a donor 
chick, transfected with a transgene construct, and implanted into the
subgerminal space of recipient embryos of freshly laid eggs. At hatching, some
of the progeny consist of a mixture of cells. An organism with nonidentical 
cells from two or more individuals is called a chimera. In some of the 
chicken chimeras, cells that were descended from transfected cells become 
part of the germ line tissue and form germ cells. Transgenic lines can then 
be established from these chimeras by rounds of matings. Generally, the 
cells of the recipient far outnumber the cells with the transgene. However, 
the proportion of donor cells can be increased to enhance the probability of 
obtaining germ line chimeras. One strategy entails gamma-irradiation of 
the recipient embryos with a dose of 540 to 660 rads for 1 hour before the 
introduction of the transfected cells. The radiation treatment destroys 
some, but not all, of the blastoderm cells, thereby increasing the final ratio 
of donor to recipient cells in the chimeric chicken. Despite its inefficiency, 
this procedure has often been used to produce transgenic chickens. 

Transgenesis could be used to improve the genetic makeup of existing 
chickens with respect to built-in (in vivo) resistance to viral, bacterial, and 
coccidial diseases; better feed efficiency; lower fat and cholesterol levels in 
eggs; and better meat quality. Avian researchers have also suggested that 
the egg, with its high protein content, could be used as a source for
pharmaceutical proteins. By analogy to the mammary gland of livestock, the
expression of a transgene in the cells of the reproductive tract of a hen that
normally secrete large amounts of ovalbumin could lead to the
accumulation of a transgene-derived protein that becomes encased in the eggshell.
Ovalbumin constitutes more than 50% of the protein of egg white;
therefore, expression of a transgene under the control of the ovalbumin
promoter and regulatory elements can yield high levels of recombinant
protein. Yields of up to 1 g of recombinant protein have been achieved per 
egg, and considering that a single hen lays more than 300 eggs per year, the 
productivity of these animal bioreactors could be substantial. The
recombinant protein could either be fractionated from the sterile egg packages or
consumed as a nutraceutical. Currently, as "proof-of-principle"
experiments, transgenic chickens that synthesize monoclonal antibodies, growth
hormone, insulin, human serum albumin, and alpha interferon have been
created. Regulatory approval of therapeutic proteins produced in eggs may
be more straightforward, as chicken eggs are already used to produce
vaccines for injection into humans.
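
The productivity estimate above is simple arithmetic, sketched here in Python. The per-egg yield (1 g) and the laying rate (300 eggs per year) are the figures quoted in the text, where the first is described as an upper bound ("up to") and the second as a lower bound ("more than"). The flock size of 1,000 hens is a hypothetical value chosen only to show how the estimate scales, and losses during purification are ignored.

```python
def annual_yield_grams(grams_per_egg, eggs_per_year, hens=1):
    """Crude estimate of recombinant protein produced per year, in grams."""
    return grams_per_egg * eggs_per_year * hens


per_hen = annual_yield_grams(1.0, 300)             # figures from the text
flock = annual_yield_grams(1.0, 300, hens=1000)    # hypothetical flock size

print(f"per hen: {per_hen:.0f} g per year")            # per hen: 300 g per year
print(f"1,000 hens: {flock / 1000:.0f} kg per year")   # 1,000 hens: 300 kg per year
```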


Transgenic Fish 

As natural fisheries become depleted, production of this worldwide food 
resource has come to depend more heavily on aquaculture. In this context, 
enhanced growth rates, tolerance of environmental stress, and resistance to 
diseases are some of the features that may be created by transgenesis. To

FIGURE 21.32 Establishing transgenic chickens by transfection of isolated blastoderm
cells. Cells from blastoderm donors are removed, transfected with a transgene, and 
inserted into the subgerminal space of an irradiated recipient blastoderm. Some of 
the resulting chickens may be chimeric. The chimeras that have the transgene in 
germ line cells are bred to establish transgenic lines. 









[Figure 21.33 labels: promoter from ocean pout AFP gene; salmon GH cDNA; 3' region from ocean pout AFP gene]


FIGURE 21.33 "All-fish" construct used to generate Atlantic salmon with an enhanced 
growth rate. Expression of the growth hormone (GH) cDNA from salmon is
controlled by the promoter and terminator-polyadenylation signals from the 3' end of
the antifreeze protein (AFP) gene from the ocean pout fish. 


date, transgenes have been introduced by microinjection or electroporation 
of DNA into the fertilized eggs of a number of fish species, including carp, 
catfish, trout, salmon, Arctic char, and tilapia. The pronuclei of fish are not 
readily seen under a microscope after fertilization; therefore, linearized 
transgene DNA is microinjected into the cytoplasm of either fertilized eggs 
or embryos that have reached the four-cell stage of development. Unlike 
mammalian embryogenesis, fish egg development is external; hence, there 
is no need for an implantation procedure. Development of transgenic fish 
occurs in temperature-regulated holding tanks. The survival of fish 
embryos after DNA microinjection is high (35 to 80%), and the production 
of transgenic fish ranges from 10 to 70%. The presence of a transgene is 
scored by PCR analysis of either nucleated erythrocytes or scale DNA. 
Founder fish are mated to establish true-breeding transgenic lines. 
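
To give a sense of scale, the ranges quoted above can be combined into a rough estimate of how many transgenic fish to expect from a batch of injected eggs. The Python sketch below assumes that the 10 to 70% figure applies to the embryos that survive microinjection, which the text does not state explicitly, and the batch size of 1,000 eggs is hypothetical. Percentages are passed as whole numbers so that the arithmetic stays exact.

```python
def expected_transgenic(eggs, survival_pct, transgenic_pct):
    """Expected number of transgenic fish, assuming the transgenic
    percentage applies only to embryos that survive microinjection."""
    return eggs * survival_pct * transgenic_pct / 10_000


eggs = 1000  # hypothetical batch size
low = expected_transgenic(eggs, survival_pct=35, transgenic_pct=10)
high = expected_transgenic(eggs, survival_pct=80, transgenic_pct=70)

print(f"{low:.0f} to {high:.0f} transgenic fish per {eggs} injected eggs")
# 35 to 560 transgenic fish per 1000 injected eggs
```

Even at the low end of both ranges, the yield per injected egg is high compared with the efficiencies described earlier for livestock, which is one reason microinjection remains practical for fish.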

Many of the studies with transgenic fish have examined the effect of a 
growth hormone transgene on the growth rate. In one study, a transgene 
consisting of the promoter region from the antifreeze protein gene of a fish 
called the ocean pout, the growth hormone cDNA from salmon, and the 
termination-polyadenylation signals from the 3' end of the antifreeze
protein gene from the ocean pout was injected into eggs of Atlantic salmon
(Fig. 21.33). This expression system was chosen to enhance the
transcription of the growth hormone in cold waters. In general, the transgenic
salmon were larger and grew faster than the nontransgenic controls. An
"all-fish" construct was assembled to avoid possible biological
incompatibilities that might arise from using a growth hormone gene from nonfish
sources. For even greater specificity, an "all-salmon" growth hormone
construct was formulated and microinjected into sockeye salmon eggs. Young
transgenic salmon grow much more rapidly than nontransgenic salmon 
and become adult fish that are on average about three to five times larger 
than nontransgenic fish. Theoretically, the faster growth of farmed salmon 
would lower the cost of the feed and lessen the pollution of coastal waters 
in the vicinity of the site of the holding pens. Aquaculture with transgenic 
fish can be carried out within contained facilities; however, the impact of 
the accidental release of transgenic fish on natural populations must be 
considered. 

FIGURE 21.34 Transgenic medaka fish as biosensors of environmental pollutants. The 
promoter from the medaka vitellogenin gene (pvit) was cloned upstream of the gfp 
reporter gene encoding green fluorescent protein, and the construct was introduced 
into medaka fish. The transgenic fish can be used to detect natural and synthetic 
estrogenic compounds because the vitellogenin promoter is activated in the presence 
of these compounds and green fluorescent protein is produced. Green fluorescent 
protein can be readily visualized in living fish. 

In addition to enhancing traits that aid the production of fish for food, 
transgenesis can be used to generate systems for monitoring aquatic pollutants. 
One such biosensor system entails the use of transgenic medaka fish 
genetically engineered to express red or green fluorescent protein under 
the control of pollutant-responsive promoters. The medaka, also known as 
the Japanese killifish, has been a popular aquarium pet for many years and 
is currently gaining rapid acceptance as a model organism in which to 
study biological processes due to its small size, which enables growth in 
small aquariums; hardiness; rapid development; and transparent body, 
which facilitates visualization of internal tissues and expression of reporter 
genes, such as those encoding fluorescent proteins. Transgenic medaka 
have been developed as biosensors to detect estrogenic compounds in 
aquatic environments. Estrogens are sex steroid hormones that stimulate 
the development and maintenance of the female reproductive system and 
secondary sex characteristics and also regulate some reproductive func¬ 
tions in males. Synthetic derivatives of natural estrogens are used in most 
oral contraceptives, as a therapy for postmenopausal disorders in women, 
to treat infertility and endometriosis, and to develop female-only fish 
populations in aquaculture. A wide variety of industrial chemicals, such as 
bisphenol A and polychlorinated biphenyls (PCBs), also have estrogenic 
activity in animals and are used in the manufacture of pharmaceuticals, 
plastics, paints, detergents, and insecticides. Thus, large amounts of estro¬ 
genic chemicals are flushed into aquatic ecosystems with domestic, agricul¬ 
tural, and industrial wastewater and sewage and can have a toxic effect on 
aquatic organisms. Indeed, one study has implicated high estrogen levels 
in wastewater effluent in the feminization of wild male fish that can lead to 
severe reductions in fish populations. To develop transgenic medaka to 
monitor levels of natural and synthetic estrogens in water, the estrogen- 
responsive promoter from the medaka vitellogenin gene was cloned 
upstream of the gene encoding green fluorescent protein and then injected 
into medaka eggs (Fig. 21.34). Vitellogenins are normally synthesized in 
females in response to endogenous estrogens, such as 17β-estradiol, and in 
males in response to synthetic estrogenic compounds. Exposure of transgenic 
fish to 17β-estradiol and other natural and synthetic estrogenic compounds 
activates the vitellogenin promoter and production of green 
fluorescent protein that can be rapidly and directly visualized as emission 
of green fluorescence under normal light in living fish without additional 
reagents. This system can be expanded to include promoters responsive to 
other environmental toxins, such as heavy metals, that control the expres¬ 
sion of red and green fluorescent proteins, in some cases in the same fish. 
Each color would indicate the presence of a different pollutant. 
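
The multicolor readout described above amounts to a lookup from observed fluorescence to pollutant class. The sketch below is illustrative only: the green assignment follows the vitellogenin-gfp construct of Fig. 21.34, whereas the pairing of red fluorescence with heavy metals is an assumed example, and the names REPORTERS and interpret_fluorescence are hypothetical.

```python
# Hypothetical reporter assignments for a two-color biosensor fish.
REPORTERS = {
    # Vitellogenin promoter driving gfp (Fig. 21.34).
    "green": "estrogenic compounds",
    # Assumed: a metal-responsive promoter driving a red fluorescent protein.
    "red": "heavy metals",
}


def interpret_fluorescence(observed_colors):
    """Return the pollutant classes indicated by the observed colors."""
    colors = set(observed_colors)
    unknown = colors - set(REPORTERS)
    if unknown:
        raise ValueError(f"no reporter defined for: {sorted(unknown)}")
    return sorted(REPORTERS[color] for color in colors)


# A fish showing both colors indicates that both pollutant classes are present.
print(interpret_fluorescence(["red", "green"]))
```

A fish that shows no fluorescence returns an empty list, which under this simple model means that neither pollutant class was detected above the threshold needed to activate its promoter.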


SUMMARY 


Genetic modification of animals by recombinant DNA 
technology (transgenesis) entails introducing a cloned 
gene(s) into the genome of a cell that might, after proliferation 
and embryonic development in a receptive female, be present 
in the germ lines of some of the progeny. These founder ani¬ 
mals are then used to establish true-breeding transgenic lin¬ 
eages. The cloned gene can be introduced into the male 
pronucleus of a fertilized egg by microinjection, delivered into 
a fertilized egg by using retroviral vectors, or transfected into 
embryonic stem cells. 

Various strategies have been devised for regulating trans¬ 
gene expression, modifying transgenes, and inducing cell 
death at specific times in particular tissues of a transgenic 
organism. The Cre-loxP recombination system activates transgenes 
by selectively removing DNA elements that block transcription 
or inactivates transgenes by excising part of the 
coding sequence. The tet-on and tet-off protocols use the tetra¬ 
cycline analogue doxycycline to either turn on or turn off the 
transcription of a transgene. RNAi is often used to decrease 
(knock down) expression of a target gene. These systems have 
advanced our understanding of gene activity during develop¬ 
ment and the consequences of either gene overexpression or 
loss of gene function within a particular tissue. 

Transgenic mice provide good models for many human 
diseases, such as Alzheimer disease, and are useful as test 
systems to evaluate potential therapies for human and animal 
diseases. They provide important information about the consequences 
of defective gene products, the course of the disease, 
and the effectiveness of different therapies. Transgenic 
mice have also been used to produce recombinant therapeutic 
proteins. The XenoMouse, which synthesizes completely 
human antibodies, was created by transgenesis with a high- 
capacity YAC carrying almost all of the genes for human 
heavy and light antibody chains. 

Genetic manipulation can lead to improved livestock with 
enhanced growth rates and muscle mass, increased resistance 
to common diseases, and improved nutritional content of milk 
and meat products. A major application for transgenic live¬ 
stock is to use the mammary gland as a bioreactor for the 
production of protein pharmaceuticals in milk. Transgenic 
animals have been created by DNA microinjection or nuclear 
cloning. The latter method entails transferring the nuclei from 
cells that have been genetically manipulated in culture to 
enucleated oocytes and obtaining among the progeny some 
animals that carry the transgene in all their cells. 

Transgenesis of birds, especially chickens, can be used to 
improve strain attributes. In addition, eggs may be a reposi¬ 
tory for pharmaceuticals synthesized by transgenic chickens 
or for delivering recombinant-derived medication when an 
egg is consumed. Genetic augmentation of fish has been 
directed primarily to improving growth rates and conferring 
resistance to disease. In addition, transgenic fish are being 
considered as biosensors of environmental pollutants. 
Transgenes encoding fluorescent proteins under the control of 
promoters that are activated when particular contaminants, 
such as estrogenic compounds, are present have been incorpo¬ 
rated into fish for determining the presence of pollutants in 
water. 
REVIEW QUESTIONS 


1. How are transgenic mice created? 

2. What is positive-negative selection, and how does it 
work? 

3. What are knockout mice? How and why are they estab¬ 
lished? 

4. What are knockdown mice? How are they generated? 

5. Describe how the Cre-loxP recombination system is used to 
regulate the expression of a transgene. 

6. What are the advantages and disadvantages of using trans¬ 
genic mice as model systems for human diseases? 

7. Describe an example of a transgenic mouse that was devel¬ 
oped as a model system for a human disease. 

8. What is nuclear cloning? 


9. Discuss some ways in which transgenic livestock could 
contribute to human health. 

10. Discuss how the mammary gland could be used as a bio¬ 
reactor for the production of commercial products. 

11. Discuss how transgenesis could be used to improve organ 
transplantation. 

12. Describe a strategy to develop transgenic animals that are 
protected from infectious disease. 

13. Why are pigs carrying a phytase transgene considered to 
be environmentally friendly? 

14. What approaches have been developed to produce trans¬ 
genic chickens? 

15. Discuss how transgenesis might improve fish aquacul¬ 
ture. 
MOLECULAR BIOTECHNOLOGY AND SOCIETY 


22 Regulating the Use of Biotechnology 

23 Societal Issues in Biotechnology 


Inevitably, new technologies, especially when they are as widespread 
and pervasive as molecular biotechnology, affect society in diverse 
ways. There can be economic, social, and ethical consequences that 
result from both the implementation of a new way of doing something and 
the displacement of traditional processes. Moreover, because molecular 
biotechnology deals with the genetic engineering of life forms, it impinges 
on a number of socially sensitive issues. As a result, many questions have 
been raised about its propriety, safety, and acceptability to society. In part 
IV, we examine several of the controversial social, regulatory, political, and 
ethical aspects of molecular biotechnology. 
Regulating the Use of Biotechnology 


Major technological advances, such as molecular biotechnology, 
are seldom implemented without controversy. Because molecular 
biotechnology can potentially affect many aspects of modern 
society, including food production and medical treatment, there are sig¬ 
nificant ethical, legal, economic, and social issues that need to be consid¬ 
ered. For example, since its inception in 1973, serious doubts have been 
voiced by some individuals about the safety of recombinant DNA tech¬ 
nology. These concerns prompted scientists to declare a self-imposed 
moratorium on certain types of recombinant DNA experiments until the 
adoption of official regulatory guidelines designed to ensure that recombi¬ 
nant microorganisms were unable to proliferate outside the laboratory and 
that laboratory workers were protected from any potential hazard. The 
formulation of these regulations took place in 1974 and 1975 in open meet¬ 
ings under the scrutiny of the press. Thus, the public became aware of the 
possibilities, both negative and positive, of genetically manipulating organ¬ 
isms. Nevertheless, in the late 1970s, apprehension persisted about the 
safety of this new technology. In particular, there was concern about the 
release, either accidental or deliberate, of genetically modified organisms 
into the environment and fear that they would become uncontrolled bio¬ 
logical marauders of vulnerable ecosystems. Additional specific guidelines 
seemed to be necessary to ensure that such rare possibilities would be even 
less likely to occur. 

In 1998, a vociferous and aggressive campaign was launched against 
planting genetically modified crops and marketing products derived from 
them. The reasons for this response are complex and encompass multiple 
concerns that range from human health and safety to environmental pro¬ 
tection, corporate control of the food industry, world trade monopolies, 
trustworthiness of public institutions, integrity of regulatory agencies, and 
loss of individual choice. 

Moreover, there has been much discussion about the ethics of genetic 
manipulation of animals. The objective of these discussions has been to 
distinguish between inappropriate and acceptable procedures. There are 
no easy answers to the ethical, legal, and social questions raised by the 
various applications of molecular biotechnology. However, because the 
stakes are so high, many of these issues have been examined extensively. 


Regulating Recombinant DNA Technology 

In 1974, when it was realized that recombinant DNA technology could be 
used to engineer organisms with novel genes, concerns about safety, ethics, 
and unforeseen consequences were raised by scientists, the public, and 
government officials. Phrases such as "playing God," "manipulation of 
life," "the most threatening scientific research ever undertaken," and 
"man-made evolution" were often used by the popular press when 
describing recombinant DNA technology. The major apprehension was 
that, either inadvertently or possibly deliberately for the purposes of war¬ 
fare, unique microorganisms that had never previously existed would be 
developed and would cause epidemics or environmental catastrophes. In 
response to a certain amount of public anxiety about these "doomsday 
bugs," a group of prominent molecular biologists called for a moratorium 
on several kinds of recombinant DNA research, especially experiments that 
involved pathogenic microorganisms. 

Subsequently, in 1976, the U.S. National Institutes of Health (NIH), the 
primary U.S. research grant agency in the medical and health sciences, 
issued Guidelines for Research Involving Recombinant DNA Molecules (NIH 
Guidelines). These rules and regulations rigorously defined physical (labora¬ 
tory) containment levels for the conduct of recombinant DNA experiments. 
They also required that biological containment be a component of any 
recombinant DNA experiment, i.e., the preferred hosts for foreign DNA 
would be those microorganisms considered least likely to proliferate out¬ 
side the laboratory or to transfer their DNA to other microorganisms. For 
research with known pathogenic organisms, elaborate negative-pressure, 
controlled, self-contained rooms were recommended. Research that was 
perceived to be less dangerous could be conducted in enclosed contained 
units equipped with high-quality filter systems. Although the NIH Guidelines 
did not have legal status, most researchers, including those who did not 
receive NIH funding and private companies that were starting recombinant 
DNA technology research programs, voluntarily complied. In addition, 
other countries, using the NIH Guidelines as a model, adopted their own sets 
of restrictions for the conduct of recombinant DNA research. 

The initial NIH Guidelines were very stringent, and many scientists 
thought that they were excessive. For example, the costs of the containment 
facilities required by the guidelines effectively prevented smaller compa¬ 
nies and researchers with modest grant support from initiating programs 
using recombinant DNA technology. In anticipation of the need to modify 
the original guidelines, the NIH Recombinant DNA Molecule Program 
Advisory Committee (RAC) was created. This committee was charged with 
overseeing the developments in recombinant DNA research and, if neces¬ 
sary, refining the regulations. The RAC had to hold open meetings to dis¬ 
cuss its decisions, it had to publish and distribute the minutes of its 
meetings, and it had to allow any nonmember the opportunity to address 
the committee on any issue pertaining to recombinant DNA research. The 
membership of the RAC was broad and included both ethicists and mem¬ 
bers of the general public, although it consisted mostly of scientists. 

As part of the original NIH Guidelines, one specific class of experiments 
that was "not to be initiated at the present time" under any circumstances 
was the "deliberate release into the environment of any organism containing 
a recombinant DNA molecule." However, it was inevitable that 
genetically modified organisms that could function in natural settings 
would be developed. 

MILESTONE 

Potential Biohazards of Recombinant DNA Molecules 

P. Berg, D. Baltimore, H. W. Boyer, S. N. Cohen, R. W. Davis, 
D. S. Hogness, R. Roblin, J. D. Watson, S. Weissman, and N. D. Zinder 
Science 185:303, 1974 

After Cohen et al. (Proc. Natl. Acad. Sci. USA 70:3240-3244, 1973) 
described the basic strategy for inserting a DNA molecule from any source 
into a plasmid, the safety and ethical implications of recombinant DNA 
technology were discussed at length in the scientific community. These 
informal discussions led to the formation of a temporary committee (Ad Hoc 
Committee on Recombinant Nucleic Acids) that was composed of leading 
molecular biologists who were charged with examining the concerns in more 
detail. In 1974, this group published a brief note, which was dubbed the 
"Berg letter," simultaneously in three major scientific journals: Science, 
Nature, and the Proceedings of the National Academy of Sciences of the 
United States of America. In this communication, Berg et al. recommended 
that researchers defer creating microorganisms with novel genes for drug 
resistance and toxin production and cloning cancer-causing genes from 
viruses into bacterial host cells. Moreover, they suggested that a 
government body, which would represent the interests of the public, should 
prepare guidelines and regulations for recombinant DNA technology and that 
an international meeting of scientists should be convened to "discuss 
appropriate ways to deal with the potential biohazards of recombinant DNA 
molecules." This letter had a significant impact. It led to a scientific 
meeting at the Asilomar Conference Center in California that attempted to 
define various recombinant DNA experiments in terms of their relative risks 
and suggested which laboratory safeguards should be implemented for 
experiments that were considered to have a minimal, low, moderate, or high 
risk. In turn, the Asilomar recommendations (Berg et al., Science 
188:991-994, 1975) contributed to the development of the 1976 NIH 
Guidelines that established stringent regulations for all types of 
recombinant DNA experiments. The Berg letter was responsible for the 
formulation of the regulatory framework for recombinant DNA technology. It 
focused public attention on the potential risks and benefits of recombinant 
DNA technology and ignited a heated public debate about the technology. 
Historically, it represents an interesting example of scientists trying to 
anticipate the possible risks of a new technology before there was any 
substantial evidence that there were any hazards. 

By 1980, the original NIH Guidelines were relaxed considerably by the 
RAC as a result of experience and specific experimental data from studies 
that the committee and NIH had sponsored. For example, it was estab¬ 
lished that the host organism Escherichia coli K-12, which was most com¬ 
monly used in recombinant DNA experiments, was unable to proliferate to 
any significant extent outside the laboratory. In addition, microbiologists 
convinced molecular biologists and others that existing safety procedures 
for work with pathogenic organisms were of a high standard and that more 
stringent rules were not required. Finally, it was conceded that it was 
extremely unlikely that a pathogenic organism would be created if the 
cloned gene had nothing to do with pathogenesis in the original organism. 
Most observers were satisfied that the safety of laboratory workers was 
ensured if good laboratory techniques were used. However, specific regu¬ 
lations were added to the guidelines to safeguard against accidental spills 
during large-scale fermentations with genetically modified organisms. 

As a result of the easing of the containment requirements for more 
routine experiments, the use of recombinant DNA technology became 
more prevalent and flourished. The RAC and the NIH Guidelines had effec¬ 
tively quelled, to some extent, the original concern about the potential dan¬ 
gers of recombinant DNA research. However, two significant problems 
remained. First, what kind of regulation was needed for foods that contained 
either genetically modified organisms or products that were derived 
from genetically modified organisms? Second, how were deliberate releases 
of genetically modified organisms into the environment to be managed? 

A third potential problem area was avoided when pharmaceuticals that 
were made by recombinant DNA technology were deemed by regulatory 
authorities to be similar to pharmaceuticals produced by traditional means. 
In most countries, there was a strong consensus that existing regulations 
for the approval of pharmaceuticals for commercial use were sufficient to 
ensure both worker safety and public safety and that the process (i.e., 
recombinant DNA technology) by which a product was made was irrele¬ 
vant. The axiom that the product alone should be evaluated for its safety 
and efficacy led to the approval of a range of recombinant DNA products, 
including human drugs, vaccines, and diagnostic devices. 


Deliberate Release of Genetically Modified Microorganisms 

Despite its initial prohibition, by 1982 it became clear that the RAC would 
have to cope with requests for open-field testing of genetically modified 
organisms, i.e., for their deliberate release into the environment. 
Uncharacteristically, neither guidelines nor protocols that advised appli¬ 
cants what information should be included in their submissions had been 
prepared. This initial reluctance to establish definitive regulations was due 
to a widely held belief among many molecular biologists that genetically 
modified organisms were not significantly different from their nonengineered 
progenitors, and if a difference was present, it was thought that it 
would be readily detected by conventional biological testing. 

Three applications for field trials of genetically modified organisms 
were received by the RAC in 1982. Two dealt with genetically modified 
plants (corn [maize] and tobacco). The third proposal was concerned with 
testing a genetically modified strain of the microorganism Pseudomonas 
syringae to determine if it could limit the extent of frost damage to plants. 
This particular submission became part of the landmark case for the devel¬ 
opment of regulatory procedures for the release of genetically modified 
organisms into the environment. 

The genetic engineering portion of the P. syringae proposal involved 
removing a gene that coded for an ice nucleation protein from the organism 
and then testing whether the modified "ice-minus" strain, when sprayed 
onto the leaves of plants, could prevent frost damage. Under natural condi¬ 
tions, wild-type "ice-plus" P. syringae, which is usually found on the sur¬ 
faces of plant leaves, secretes a protein that at low temperatures causes the 
formation of ice crystals, which, in turn, causes frost damage to the plant. 
The rationale for the deletion of the gene encoding the ice nucleation pro¬ 
tein was that if a strain that lacked this protein were sprayed onto leaves 
before they became colonized with the wild-type strain, it might lower the 
temperature at which ice formation would occur, thereby preventing the 
leaves from being damaged by bacterially induced ice crystals. There is a 
significant economic incentive for such a novel treatment, because in the 
United States, crop losses due to frost damage cost farmers billions of dol¬ 
lars each year. 

In response to each of the requests for field testing of a genetically 
modified organism, the RAC followed, more or less, the procedures it had 
established for handling the regulation of recombinant DNA experimentation 
in the laboratory. 

1. The submissions were announced in the U.S. Federal Register. 

2. Information was sent to 3,000 interested persons. 

3. A panel of experts reviewed the proposals. 

4. A public meeting was called for discussion of each proposal. 

5. At the same time as the RAC was reviewing the proposals, the U.S. 
Department of Agriculture (USDA) also reviewed them. 

After careful consideration, both the USDA and the RAC approved the 
"ice nucleation gene deletion" proposal. In 1983, the director of NIH gave 
final endorsement to the RAC decision. On the same day that permission 
was granted to proceed with the field trial, a lawsuit to block the test was 
filed by an organization called the Foundation on Economic Trends, which 
is headed by Jeremy Rifkin, who strongly opposes all forms of genetic 
engineering. The lawsuit was upheld, with the judge noting that the RAC 
had not carried out a proper hearing in accordance with U.S. statutes and, 
more importantly, that it had failed to request an environmental impact 
statement. 

This legal decision dramatically demonstrated that, despite the scien¬ 
tific opinion of the RAC and its experts, the existing regulatory system for 
field testing of genetically modified organisms was inadequate. A prevalent 
opinion outside the confines of the RAC was that the release of a genetically 
modified organism into the environment could have far-reaching effects 
because living microorganisms proliferate, persist, disperse, and sometimes 
transfer their DNA to other microorganisms. Some critics of the release of 
genetically modified organisms into the environment believed that, after its 
introduction into the environment, an engineered organism could displace 
an existing important species from its ecological niche and as a result cause 
severe environmental damage. In addition, some opponents of release 
believed that genes could be transferred from an introduced genetically 
modified organism to indigenous strains, thereby creating, albeit inadver¬ 
tently, an ecologically dangerous organism. Although these points of view 
presented worst-case adverse-effect scenarios that might be exceedingly 
unlikely, it was essential that the regulatory protocol for field testing include 
a thorough assessment of the potential risk that an introduced organism 
might pose for the environment. 

The responsibility for assessing the initial submissions for the delib¬ 
erate release of genetically modified organisms in the United States resides 
with the U.S. Environmental Protection Agency (EPA) and the USDA. The 
NIH drew up an initial set of criteria for field tests with genetically modified 
organisms, but it relinquished its authority in this area to these other 
agencies. The EPA decided to use two applications, both dealing with ice 
nucleation-defective bacteria, as prototype cases for developing an assess¬ 
ment process for the field testing of genetically modified organisms. Each 
proposal went through a series of reviews, which included appraisals of 
the environmental fate, ecological effects, and human health consequences 
of the test, as well as product analysis, by the following groups: 

• The Office of Pesticide Program Review of the EPA 

• The Toxic Substances, Research and Development Policy Planning, 
and Evaluation Committees of the EPA 
• The General Counsel of the EPA 

• The USDA, Food and Drug Administration (FDA), and NIH 

• A Science Advisory Panel that consisted of a microbiologist, a plant 
pathologist, and a community ecologist 

• Open public meetings 

• Various state agencies, which in this instance included the California 
Department of Agriculture 

It was not envisioned that this elaborate, time-consuming, and often 
redundant process would become the routine mechanism for approving 
field testing of genetically modified organisms; rather, it was assumed that, 
with experience, the system would be trimmed without loss of effective 
assessment of the potential hazards of each trial. After what was thought to 
be a very thorough set of analyses, permission was granted for both of the 
field trials with ice nucleation-negative bacteria. However, in both instances, 
although the circumstances were different, local residents who were wor¬ 
ried about the release of a genetically modified organism in their neighbor¬ 
hoods obtained court orders that temporarily blocked each of the field 
trials. As a consequence of this delay, both the EPA and the USDA implemented 
better methodologies for determining the risks of introducing 
genetically modified organisms into the environment. In a short time, the 
staffs at these agencies became more proficient at handling and analyzing 
the data submitted by the applicants. The scientific community, including 
ecologists, helped the process by initiating research programs that were 
designed to examine the consequences of the release of organisms into 
model environments, and scientific organizations formulated frameworks 
for deciding whether a particular genetically modified organism would 
have an adverse effect on the environment. 

Eventually, in 1987, the field trials with ice nucleation-negative bacteria 
were conducted at sites in California. The results indicated that these 
genetically modified organisms were not dispersed to off-site locations, nor 
did they persist at the site of application. At one site, the freezing tempera¬ 
ture of the test plants was lowered by 1°C. However, for a number of rea¬ 
sons, genetically engineered ice-minus bacteria have not been developed 
commercially to protect crop plants from frost damage. 

Since the first trials of ice-minus bacteria, open-field tests of genetically 
modified microorganisms have become commonplace. Overall, these 
studies have found that introduced microorganisms tend to remain con¬ 
fined to the test area, do not persist for more than a few months, do not 
transfer genes to indigenous microorganisms, and have similar basic bio¬ 
logical functions in natural and laboratory settings. Generally, because 
there can be many different possible adverse effects for each genetically 
modified organism, a case-by-case approach has been adopted for granting 
permission to conduct a field test. These kinds of tests have been carried 
out in the United States, the United Kingdom, Australia, and other coun¬ 
tries. However, biotechnology companies have been reluctant to develop 
genetically modified organisms that can be used in the environment 
because the cost of field testing is high and because final approval may be 
denied despite successful test results. Nevertheless, there is a growing 
consensus that the environmental release of genetically modified organ¬ 
isms after the appropriate laboratory and field tests will not be ecologically 
deleterious. 
Regulating Food and Food Ingredients 

In the United States, the FDA is responsible for regulating the introduction 
of foods, drugs, pharmaceuticals, and medical devices into the market¬ 
place. The safety of both crop foods and food ingredients that include fla¬ 
vors and additives must be thoroughly assessed before they can be licensed 
for human consumption. The FDA has had a well-established, although not 
foolproof, system for the approval of new foods and food products for 
some time. Critics of the FDA, however, have argued that it tends to favor 
the interests of industry and is too lenient in enforcing its own regulations. 
Both the FDA and the food industry, which is represented by the 
International Food Biotechnology Council, have argued, somewhat con¬ 
vincingly, that new regulations are not required for foods and foodstuffs 
that are developed by recombinant DNA technology because any unli¬ 
censed food or food ingredient, regardless of how it is produced, must be 
assessed for safety by toxicity, allergenicity and impurity testing. The 
approach in the United States has been that food products produced 
through recombinant DNA technology are not considered to be inherently 
riskier than those derived through traditional forms of genetic 
improvement, such as selective breeding or cross-pollination. In other 
words, it is the new food product itself, regardless of the method of 
genetic improvement used, that should be evaluated for risks, not the 
method by which the product was generated. Commercially available food 
products derived from genetically modified organisms are deemed as safe as 
those derived from their 
nonengineered counterparts. 

Food Ingredients Produced by Genetically Engineered 
Microorganisms 

Chymosin. A new food product is usually subjected to a large battery of 
tests. However, in order to streamline the process and lower the costs of 
developing a food product, the similarity of the new product to the one that 
it is designed to replace is taken into consideration. For example, the FDA 
approved the enzyme chymosin, an agent produced by recombinant DNA 
technology for use in cheese making, without demanding a full range of 
tests. Chymosin, one of the key components of rennet, is a milk-clotting 
proteolytic enzyme that hydrolyzes the κ-casein protein of milk. This 
enzymatic cleavage creates curds, which in turn are processed into cheese. 
Traditionally the milk-clotting agent for cheese making is derived from the 
fourth stomach of calves and consists of a mixture of substances that col¬ 
lectively is called rennet. 

To ensure a reliable, convenient, and possibly cheaper industrial 
supply of chymosin, one of the chymosin genes was cloned and expressed, 
and the product was harvested from E. coli K-12. When a petition requesting 
permission to use recombinant chymosin for the commercial production of 
cheese was presented to the FDA, it was necessary to decide what criteria 
should be required for the approval process. Because there has been a long 
history of using rennet containing chymosin in the cheese-making industry, 
the FDA reasoned that, if the recombinant chymosin was identical to the 
naturally occurring chymosin, then excessive testing was not necessary. In 
essence, the petitioner had to show that the recombinant chymosin was 
identical to the chymosin of rennet. To substantiate this, restriction 
mapping, DNA hybridization, and DNA sequencing were used to establish that 
the cloned and native DNA sequences of the chymosin gene were identical. 
Moreover, the recombinant chymosin had the same molecular weight as 
purified calf chymosin, and the biological activities of the two forms of the 
enzyme were the same. 

FIGURE 22.1 Preparations of L-tryptophan (A) produced in genetically 
engineered E. coli were found to be contaminated with EBT (B). 

Next, it was essential to establish that the recombinant chymosin 
preparation was safe. The company showed that, as part of its purification 
process, recombinant chymosin is extracted from inclusion bodies and that 
the final preparation is free of whole bacterial cells, significant cell debris, 
and other impurities, including nucleic acids. Although the presence of 
minute amounts of E. coli K-12 cells in the final preparation of chymosin is 
undesirable, numerous studies have established that this strain is 
nontoxigenic and nonpathogenic to humans. To ensure that the recombinant chy¬ 
mosin preparations did not contain an unexpected toxin, animal testing 
was performed, and the results showed no adverse effects. After compiling 
all the information, the FDA concluded that the recombinant chymosin 
could be licensed for commercial use. Currently, about 85% of all cheeses in 
the United States are produced with recombinant chymosin. 

Tryptophan. For the most part, agencies in various countries that are 
responsible for regulating food and foodstuffs derived from recombinant 
DNA technology have adopted a case-by-case approach. Each submission 
is considered separately, and depending on the judgment of the regulatory 
body, a series of tests is specified to ensure that the product is safe. Although 
the industry prefers and urges that government agencies create a single set 
of standards for all products derived by genetic engineering, there is con¬ 
siderable reluctance to go in that direction. Currently, the introduction of 
recombinant foodstuffs for human consumption is being handled with a 
degree of caution, especially since a false assumption that seems logical 
initially can cause unexpected and perhaps tragic results. 

During 1989 and 1990 in the United States, an unusually high number 
of cases of the disease eosinophilia-myalgia syndrome (EMS) were 
reported. This generally rare disease causes severe, debilitating muscle 
pain and can be fatal as a result of respiratory arrest. A consistent feature 
among the occurrences of EMS was that the patients had been consuming 
large doses of the amino acid tryptophan as a food supplement. In each 
case, the source of the tryptophan was traced back to a single chemical 
company. The possible correlation between tryptophan and EMS was puz¬ 
zling, because there had been no history of significant negative effects 
when tryptophan extracted from E. coli had previously been used as a 
dietary agent. Further investigation revealed that all of the suspected 
batches of the "tainted" tryptophan had been produced by a genetically 
engineered E. coli strain that had been designed to overproduce trypto¬ 
phan. The company had assumed that the enhanced strain was identical to 
the previous one; therefore, no additional product safety tests were thought 
to be necessary. At the same time, what was thought to be a minor step in 
the purification process was changed, while the previous quality control 
measures that had been used to assay the purity of the final preparations 
were retained. 

Chemical analyses of the commercial preparations that had been pro¬ 
duced by the genetically engineered strain revealed that they contained 
novel metabolic derivatives of tryptophan, including 
1,1'-ethylidenebis[L-tryptophan] (EBT) (Fig. 22.1). Initially, the presence 
of EBT was considered to be due to some metabolic quirk in the new strain. 
While research focused on determining whether EBT caused EMS, other studies 
showed that EBT 
was produced by wild-type strains, as well. Toxicity studies established 
that EBT induced pathological changes in rats that are similar to EMS and, 
surprisingly, that even tryptophan, to a lesser extent, produced some EMS 
symptoms. Consequently L-tryptophan, even without impurities, was 
banned for human consumption in the United States. It is not clear what 
actually caused EBT to appear in a product that had previously been safe. 
Most observers believe that the change in the purification process allowed 
EBT to contaminate the tryptophan. Presumably, although the company 
was unaware of it, the old method effectively removed it. 

One of the lessons of this episode, even though genetic engineering 
may not have been the problem, is that biological equivalence between a 
strain and its genetically altered counterpart should not be assumed. This 
is as true for a strain produced by traditional methods as it is for one that 
has been genetically engineered. Furthermore, manufacturers are now 
more aware that a minor technical change in the purification phase can 
alter the nature of a product. However, what they do with this knowledge 
may be problematic; many companies do not want to run a complete bat¬ 
tery of toxicity tests for a product that they believe has already been thor¬ 
oughly tested. However, despite economic costs, many manufacturers are 
opting for a "better safe than sorry" approach. 

Bovine somatotropin. The bovine somatotropin (also called BST, bST, 
bovine somatotrophin, or bovine growth hormone) controversy illustrates 
the constellation of issues that can arise from implementing a recombinant 
DNA product. In this case, Monsanto sought and won approval to market 
recombinant bovine somatotropin (Posilac). Concerns about animal health 
and welfare, human food safety, and the socioeconomic impact on small 
dairy farmers all came into play. Also, governmental agencies from dif¬ 
ferent countries disagreed with some of the conclusions of the FDA. 

In the 1930s, it was shown that injection of bovine somatotropin into a 
dairy cow increases its milk yield significantly. Because natural bovine 
somatotropin is both difficult and costly to accumulate in large quantities, 
it has not been used routinely as an augmenting agent by the dairy industry. 
However, with recombinant DNA technology, the gene for bovine soma¬ 
totropin was cloned into E. coli and expressed. Recombinant bovine soma¬ 
totropin was harvested, purified, and tested. Under trial conditions, milk 
production in dairy cows was increased by 20 to 25% after injection of 
recombinant bovine somatotropin. 

The safety of natural bovine somatotropin in milk has been studied 
exhaustively. In treated cows, the levels of bovine somatotropin in milk are 
not higher than those in control cows. Moreover, bovine somatotropin is 
not active in humans, and all toxicity trials have shown that there are no 
adverse effects on test organisms. The FDA, using all the research results 
that it could assemble, concluded that both the meat and milk of recombi¬ 
nant bovine somatotropin-treated cows are safe for human consumption. 
This conclusion was supported by the U.S. Office of Technology Assessment 
after its independent analysis of many of the bovine somatotropin studies. 
On 5 November 1993, Monsanto's application to use recombinant bovine 
somatotropin as a milk production enhancer was granted. 

An effective and powerful lobby group had been assembled to block 
governmental approval of recombinant bovine somatotropin. The principal 
reason for this opposition was the presumed economic consequences that 
recombinant bovine somatotropin would have on the dairy 
industry. The fear was that many small dairy farms would become unprof¬ 
itable because fewer cows would be required to maintain current levels of 
milk production. In addition to the attrition of small dairy farmers, oppo¬ 
nents of recombinant bovine somatotropin thought that the industry would 
become dominated by large corporate interests at the expense of indepen¬ 
dent producers. At the time, these economic arguments may have been 
justified, and certainly any group is entitled to protect itself against what it 
perceives to be a threat to its livelihood. However, the main issue in the 
advertising campaign waged against recombinant bovine somatotropin 
was that "genetically engineered hormones" in "hormone-laced milk" 
would be harmful and cause cancer in humans. Using recombinant DNA 
technology as a bogeyman probably made this campaign emotionally 
effective. 
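
The economic fear described above can be made concrete with a little 
arithmetic. The sketch below assumes, purely for illustration, that total 
milk output is held constant and that every cow in a herd responds with the 
20 to 25% yield gain reported under trial conditions. The herd of 100 cows 
and the function name are hypothetical. Under those assumptions, the number 
of cows needed scales as 1/(1 + gain). 

```python
def cows_needed(current_herd: int, yield_gain: float) -> float:
    """Herd size that keeps total milk output constant after a per-cow
    fractional yield gain (e.g., 0.20 for a 20% increase)."""
    if yield_gain <= -1.0:
        raise ValueError("yield_gain must be greater than -1")
    return current_herd / (1.0 + yield_gain)


# Hypothetical herd of 100 cows; gains taken from the trial range in the text.
for gain in (0.20, 0.25):
    needed = cows_needed(100, gain)
    print(f"gain {gain:.0%}: {needed:.1f} cows, {1 - needed / 100:.1%} fewer")
```

With these inputs, a 20% gain corresponds to roughly one-sixth fewer cows and 
a 25% gain to one-fifth fewer, which is the scale of reduction that opponents 
had in mind. Real herds would not respond uniformly, so this is an upper-bound 
illustration rather than a forecast. 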

In addition to economic concerns, the opponents of recombinant 
bovine somatotropin contended that its use would increase the incidence 
of bacterial infection of milk glands (mastitis) in dairy cattle. It was further 
argued that larger-than-normal amounts of antibiotics would have to be 
used to maintain the health of recombinant bovine somatotropin-treated 
animals, thereby resulting in increased levels of antibiotics in the cows' 
milk that might trigger allergic responses in some consumers. Moreover, 
increased antibiotic use could heighten the selection pressure for 
drug-resistant pathogens. The Veterinary Medicine Advisory Committee of the 
FDA and others have studied this issue and concluded that the frequencies 
of mastitis in treated and untreated cows are no different. On this basis, 
there is no reason to believe that the amounts of antibiotics in milk from 
recombinant bovine somatotropin-treated cows would be any greater than 
those from untreated cows. Moreover, by law, after any cow is treated with 
antibiotics, it is not milked for a specified period of time to enable the 
medication to be cleared from its system. 

The concern about an increased risk of cancer was based on the pre¬ 
sumed increased concentration of insulin-like growth factor I in the milk of 
treated cows. The FDA maintains that the amount of insulin-like growth 
factor I in recombinant bovine somatotropin-stimulated milk falls within 
the normal range observed for untreated milk. Moreover, any additional 
insulin-like growth factor I would add only a small amount to the existing 
pool of insulin-like growth factor I in human plasma. Taking into consider¬ 
ation these and other matters, the FDA reviewed the complete recombinant 
bovine somatotropin file in 1999 and found no reason to rescind its original 
decision. 

Notwithstanding the certainty of both the FDA and the Joint Food and 
Agricultural/World Health Organization Expert Committee on Food 
Additives that recombinant bovine somatotropin poses no hazards for 
treated animals, the Canadian equivalent of the FDA (Health Canada) and 
the European Union both refused to approve recombinant bovine soma¬ 
totropin on the grounds that it is detrimental to treated animals. In these 
instances, appointed committees concluded that recombinant bovine 
somatotropin supplementation increases the risk of mastitis, causes leg and 
foot disorders, decreases reproductive capabilities, and induces severe 
reactions at the site of injection. FDA analyses and more recent studies do 
not support these assertions. Clearly, the biological consequences of recom¬ 
binant bovine somatotropin remain controversial. Currently, in the United 
States, about 15% of dairy producers use Posilac. There have been no 
reports since the onset of its commercial use that recombinant bovine 
somatotropin has had debilitating effects on treated cows. Studies also 
indicate that recombinant bovine somatotropin treatment has neither had 
an apparent impact on consumer prices nor led to extensive consolidation 
of small dairy farms. Moreover, a new market niche has been created. In 
many food stores, the consumer is given a choice of milk and milk products 
from either recombinant bovine somatotropin-treated or untreated cows. 

TABLE 22.1 Land area under cultivation with transgenic crops in countries that 
cultivated more than 1 million hectares in 2008 

Country         Area (million hectares)   Transgenic crop(s) 
United States   62.5                      Soybean, corn, cotton, canola, squash, papaya, alfalfa, sugar beet 
Argentina       21.0                      Soybean, corn, cotton 
Brazil          15.8                      Soybean, corn, cotton 
India            7.6                      Cotton 
Canada           7.6                      Canola, corn, soybean, sugar beet 
China            3.8                      Cotton, tomato, poplar, petunia, papaya, sweet pepper 
Paraguay         2.7                      Soybean 
South Africa     1.8                      Corn, soybean, cotton 

Adapted from James, Global Status of Commercialized Biotech/GM Crops: 2008, International Service for 
the Acquisition of Agri-Biotech Applications (ISAAA) brief no. 39 (ISAAA, Ithaca, NY, 2008). 

By 1996, many of the initial apprehensions about regulating foodstuffs 
that were developed with recombinant DNA technologies appeared to 
have subsided in the United States as the FDA implemented traditional risk 
assessment procedures. The food industry preferred a minimal set of regu¬ 
lations to speed up the transition from the developmental phase to the 
marketplace. In this context, governmental agencies, as representatives of 
the public, have a dual responsibility. They are entrusted with protecting 
public health and ensuring that new developments are not needlessly cur¬ 
tailed. Presumably, any changes to the regulations will not be made for the 
sake of expediency at the expense of safety. 

Genetically Modified Crops 

While there is still polarization on the issue of genetically modified crops, 
many national governments have approved the commercialization of 
transgenic plants (Table 22.1). The United States cultivates more than 62 
million hectares, which represents about 50% of the total global area 
planted with transgenic crops. In total, 25 countries now grow transgenic 
crops, with developing countries beginning to outpace industrial countries 
in the rate of increase. Four crops, soybean, corn, cotton, and canola, repre¬ 
sent about 99% of the genetically engineered crops, with squash, papaya, 
alfalfa, sugar beet, tomato, sweet pepper, petunia, and carnation making 
up the balance. Most have been engineered to increase productivity by 
incorporating genes that confer tolerance for herbicides and resistance to 
insect predation. Other traits that have been approved but are marketed to 
a lesser extent include resistance to viral infection (especially in papaya) 
and altered plant quality (e.g., flower color in carnations). 
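
The "about 50%" figure can be cross-checked against the areas listed in 
Table 22.1. The short sketch below sums the eight countries in the table and 
computes the U.S. share of that subtotal. Because the table omits the 
countries that each grew less than 1 million hectares, the subtotal slightly 
understates the global area and the computed share slightly overstates the 
global U.S. share. The variable names are illustrative only. 

```python
# Areas in million hectares, copied from Table 22.1 (2008 data).
area_mha = {
    "United States": 62.5,
    "Argentina": 21.0,
    "Brazil": 15.8,
    "India": 7.6,
    "Canada": 7.6,
    "China": 3.8,
    "Paraguay": 2.7,
    "South Africa": 1.8,
}

# Subtotal for the listed countries only (not the full global area).
listed_total = sum(area_mha.values())
us_share = area_mha["United States"] / listed_total

print(f"listed total: {listed_total:.1f} million ha")
print(f"U.S. share of listed total: {us_share:.1%}")
```

The listed countries account for about 123 million hectares, of which the 
United States contributes just over half, in line with the statement in the 
text. 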

Although the need to increase agricultural production is recognized, the 
overarching purpose of regulating the commercialization of transgenic 
crops, as for any crop, is to ensure that the crops are safe for humans and 
livestock to eat and safe to grow. Not only must the products be nontoxic 
for consumption, there must also be consideration of their potential impact 
on other organisms and on environments outside the area of cultivation, 
because the plants are usually grown in an open-field environment. For 
example, there is potential for plants carrying insecticidal toxins to 
adversely affect nontarget insects or to crossbreed with wild plants, which 
could enhance their invasiveness. In many countries, regulations gov¬ 
erning transgenic crops are still being developed and are evolving. 

In establishing regulations, the general consensus among national 
regulatory agencies has been to consider the characteristics of a transgenic 
plant rather than the process by which it was created. The prevalent view 
of transgenic plants has been that they are not different from traditional 
plant strains (cultivars) that are derived from traditional breeding experi¬ 
ments. In the United States, three agencies are responsible for assessing and 
approving applications for the development and release of genetically 
modified crops. The USDA is responsible for protecting agriculture and the 
environment from pests, the FDA is responsible for the safety of human 
food and animal feed, and the EPA is responsible for regulating pesticides, 
including plants that are engineered to produce pesticides. The same 
testing and licensing procedures are applied to all plants that carry genetic 
modifications, regardless of how these changes were introduced. 

Many other countries have taken a similar approach when adopting 
legislation and follow general guidelines provided by the Organisation for 
Economic Co-operation and Development and the Cartagena Protocol on 
Biosafety, which was established to develop international standards for 
biosafety. The regulatory system in the European Union differs somewhat 
from that in the United States in placing greater emphasis on the process 
by which the crops have been developed. There is also a requirement for 
additional information that is used for labeling and to enable tracking of 
the origin of the plant. 

In general, each new transgenic plant is considered on a case-by-case 
basis. Even when a previously approved transgene is introduced into a dif¬ 
ferent plant variety or into the same plant genotype, each transgenesis 
event is assessed in a new application because the site of insertion into the 
plant genome is random and therefore the potential for disruption of the 
function or regulation of endogenous or introduced genes must be consid¬ 
ered. Other traits that are assessed include the rate and method of repro¬ 
duction, the potential for transfer of genetic material to other organisms, 
toxicity to other organisms, and sexual compatibility with wild relatives. 
For plants engineered with insecticidal proteins, strategies to manage 
insect resistance must also be in place (Table 22.2). 

To date, in the United States, 12,000 field trials for genetically modified 
plants have been authorized. Over 100 transgenic crops have been 
approved for commercialization. Some argue that the lengthy and costly 
pre- and postcommercialization regulatory process is prohibitive to the 
development of new transgenic crops with traits other than herbicide and 
pest resistance, which would be produced on a smaller scale. They also 
argue that the high cost of meeting regulatory requirements, estimated to 
be $20 million to $30 million per product, precludes development by 
anyone other than large multinational companies. Proponents maintain 
that after 20 years of scrutiny and incorporation of many of the safeguards 
to ensure that transgenic crops are safe for consumption and release into 
the environment, the regulatory process could be streamlined. 

TABLE 22.2 Some of the studies submitted to the U.S. EPA as part of the approval 
process for a variety of Bt corn 

Molecular characterization of insect protected from transgenic corn 
Evaluation of transgenic corn line in U.S. field trials 
Assessment of the equivalence of B. thuringiensis subsp. kurstaki protein with 
  commercial Bt corn varieties and microorganisms 
Dietary toxicity study with Bt corn meal in the northern bobwhite 
Aerobic soil degradation study of Bt protein 
Acute oral toxicity study of Bt corn tryptic protein in albino mice 
Assessment of the in vitro digestive fate of Bt protein 
Stability of Bt protein in sucrose and honey solutions under nonrefrigerated 
  temperature conditions 
Evaluation of the dietary effects of purified Bt proteins on honeybee larvae 
Evaluation of the dietary effects of purified Bt proteins on honeybee adults 
Dietary toxicity study of activated Bt protein with green lacewing larvae 
Dietary toxicity study of activated Bt protein with the parasitic hymenopteran 
  Brachymeria intermedia 
Dietary toxicity study of activated Bt protein with ladybird beetles 
Evaluation of Bt corn feed as ingredient for catfish 
Acute toxicity study of Bt protein with the earthworm in an artificial soil 
  substrate 
Effects of Bt protein on Folsomia candida and Xenylla grisea (Insecta: Collembola) 
Expression of Bt protein in Bt corn 
Tissue expression and corn earworm efficacy of Bt protein in Bt corn 
Chronic exposure of F. candida to corn tissue expressing Bt protein 
Corn pollen containing Bt protein: 48-h static-renewal test with Daphnia magna 
  (Cladocera) 

Adapted from National Research Council, Genetically Modified Pest-Protected Plants: Science and 
Regulation, p. 245 (National Academies Press, Washington, DC, 2000), with permission. 

Bt corn, corn treated with the genetically engineered Bacillus thuringiensis cry gene to make it resistant 
to pests. 

Despite the successful history of transgenic crops, recent legal cases in 
the United States may indicate that regulatory scrutiny will increase, rather 
than be relaxed as some producers had hoped. In three separate cases, it 
was ruled that the USDA had approved releases (field trials of genetically 
engineered turf grass in Oregon and of corn and sugarcane in Hawaii, and 
commercial cultivation of genetically engineered alfalfa in California) 
without adequate consideration of their environmental impact. 

Somewhat more challenging is the regulation of transgenic plants that 
produce pharmaceutical proteins. Frequently, these are agricultural crop 
plants, such as corn, tobacco, potato, rice, and safflower, although so far 
they have mainly been developed for production in contained facilities, for 
example, in cell cultures. Pharmaceutical plants grown in greenhouses, cell 
culture systems, and other contained facilities are regulated as drugs. The 
plants are genetically manipulated, often with several different genes, to 
maximize yields of proteins that are intended to be biologically active in 
humans or other animals. Thus, the plants potentially pose a greater risk to 
human health and the environment when grown in the field and therefore 
are given special consideration by regulators. For example, there are 
concerns that material derived from the plants could inadvertently end up in 
the food chain, perhaps through seed dispersal. Prevention of such 
problems may require special confinement measures, such as dedicated farm 
machinery or containment netting, to separate pharmaceutical crops from 
food crops. Employing nonfood crops, such as tobacco, to produce phar¬ 
maceutical proteins would circumvent some of these problems and also 
provide tobacco farmers, who are facing income losses due to a decline in 
tobacco consumption, with an alternate market. 

Genetically Engineered Livestock 

The regulation of genetically engineered livestock is not fundamentally dif¬ 
ferent from that of genetically modified crops. Generally, the same agencies 
in the United States are responsible for regulatory oversight of both. As 
with cultivation of plants, the goal for the commercial production of ani¬ 
mals, regardless of the method of production, is to ensure that the animal 
products are safe for use and that the impact of the animals on the environ¬ 
ment is as low as possible. In addition, the well-being of the animals must 
be protected. For new, rapidly developing, and expanding technologies, 
regulators are challenged with ensuring that these goals are met while rec¬ 
ognizing the power of the technologies to solve important problems in 
agriculture and medicine. Moreover, for acceptance of the products of these 
animals and for commercial viability, society must be confident that safety 
issues have been addressed. 

The regulatory issues for cloned livestock that have not undergone 
modification of their genomes are separated from those for transgenic ani¬ 
mals. In the United States, the Center for Veterinary Medicine of the FDA 
considers that the cloning of livestock through somatic cell nuclear transfer 
is not different from other assisted reproductive technologies that have 
been practiced for many years and does not pose greater risks. Many 
cloned animals are unhealthy, most likely because the genome transferred 
from a differentiated somatic cell fails to undergo epigenetic 
reprogramming. Reprogramming normally returns the genomes of differentiated 
egg and sperm 
cells to a nascent state in an early embryo, which is required for normal 
development. However, sick animals are generally destroyed, and their tis¬ 
sues are not consumed as food by humans or other animals. Healthy 
cloned animals are used primarily as breeding stock. Their offspring, 
which are consumed as food, are produced through sexual reproduction 
and therefore develop normally (i.e., undergo proper epigenetic program¬ 
ming) and are generally born healthy. The FDA takes the position that 
healthy animals produce safe foods. After extensive assessment of meat 
and milk composition, it was found that meat and milk derived from 
cloned animals are not different from, and are as safe to eat as, those from 
animals 
produced using conventional agricultural practices. Additional regulations 
specific for cloned animals were determined to be unnecessary because any 
problems associated with the products from cloned animals can be identi¬ 
fied through food inspections in current practice. 

All genetically modified animals must be approved before they are 
commercialized, and to date, no animals that enter the food chain have 
received regulatory approval. In the United States, genetically engineered 
animals, whether containing heritable or nonheritable recombinant DNA 
and regardless of their intended use, are considered to be "new animal 
drugs" by the FDA. Each time a gene is introduced into an animal, approval 
must be obtained, even when the same gene is introduced into different 
animals. As for transgenic plants, it is generally recognized that the site of 
insertion into the animal's genome is difficult to control and may affect the 
level of expression of the transgene or the animal's native genes and there¬ 
fore could affect the health of the animal. All animals containing a recom¬ 
binant DNA construct are regulated, including subsequent generations of 
animals that contain the construct that were derived from the original 
manipulated parent animals by breeding with a nontransgenic animal. 

A distinction may be made between transgenic animals that are to be 
used for food production and nonfood transgenic animals. Transgenic ani¬ 
mals that are used for nonfood research purposes (e.g., laboratory mice) or 
to produce human therapeutic proteins are contained within controlled 
environments and may not be as strictly regulated as those that are grown 
in open environments for food production unless safety issues are raised. 
Laboratory animals and animals raised in contained facilities for produc¬ 
tion of pharmaceuticals pose a low risk for unintended release into the 
environment. A transgenic goat that was engineered to produce human 
antithrombin protein was approved by the FDA in 2009 and was the first 
transgenic animal to receive regulatory approval for commercialization. 
Antithrombin is extracted from goats' milk and is used to prevent blood 
clots from developing during surgery and childbirth in humans who are 
unable to produce sufficient amounts of the protein. Genetically engineered 
animals, such as insects, that have been developed for nonfood uses, like 
biocontrol of pests, and that may be released into uncontrolled environ¬ 
ments are evaluated by the EPA. 

To gain approval, a genetically engineered food animal must be demon¬ 
strated to be safe for production and consumption, and the introduced char¬ 
acteristic must be shown to be effective as intended. The risk assessment by 
the FDA and the regulatory agencies of many other countries is consistent 
with the international standards still under development by the 
Codex Alimentarius Commission of the Food and Agriculture Organization 
and World Health Organization. The FDA considers several characteristics, 
including the method of introduction and detection and the nature of the 
introduced DNA (e.g., expression, stability, and potential for the construct to 
recombine with pathogens); the traits conferred by the introduction of the 
transgene(s) and any alterations from the normal phenotype of the corre¬ 
sponding nontransgenic animal, especially where it may impact the compo¬ 
sition of the animal product (e.g., production of allergens, toxins, and novel 
metabolites and potential for unintended cell deregulation); and methods 
for production, processing, and disposal of transgenic animals and their tis¬ 
sues. Regulations also take into consideration the risk associated with the 
release of a genetically modified animal into the environment, whether 
intentional or accidental. Measures to prevent the escape of farm animals, 
and their products, from barns and pastures are required. Transgenic 
animals must be prevented from mating with nontransgenic relatives, which 
could spread the transgene, disrupt an ecosystem, or reduce biological 
diversity. Aquaculture facilities that raise transgenic fish, for example, must 
ensure that the fish cannot escape and poachers cannot get in. 

Patenting Biotechnology 

The principal objective of biotechnology is to produce commercial products 
for economic gain. However, no company will initiate high-risk, long-term 
projects without knowing that the results of its research efforts can be 
legally protected from competitors. At the same time, society at large has a 
stake in encouraging industrial innovation. A strategy that meets both of 
these objectives is for the government to grant inventors exclusive rights to 
the novel products or processes that they develop. Collectively, these sanc¬ 
tioned privileges are called intellectual property rights and include trade 
secrets, copyrights, trademarks, and patents. Trade secrets comprise pri¬ 
vate information about specific technical procedures and formulations that 
a company wishes to protect from others. Copyrights protect the author¬ 
ship of published works from unauthorized use. Trademarks can be either 
words or symbols that identify a particular product or process of one com¬ 
pany. For example, the term FailSafe is the legally recognized designation 
for a polymerase chain reaction (PCR) procedure marketed by Epicentre 
Technologies. Other companies that sell similar PCR kits have created their 
own protected names. 

For biotechnology, patents are the most important form of intellectual 
property. A patent is a legal document that gives the patent holder exclu¬ 
sive rights to implement the described invention commercially. Moreover, 
on the basis of the extent of the claims of the patent, the patent holder can 
develop other products that are directly derived from the original inven¬ 
tion, while competitors would have to license the right to use the invention 
in order to develop a product based on it. On the other hand, a patent is a 
public document that must contain a detailed description of the invention, 
so it informs others about the nature and limits of the invention, allowing 
them to decide whether they should continue working in a particular direc¬ 
tion or try to use the patented invention as a springboard to other possible 
innovations. 

Patent decisions and laws vary from country to country, although there 
are ongoing attempts to develop international standards. The duration of 
the exclusive rights of a patent is 20 years from the date that the application 
is filed in all countries. However, in the United States, if there is a dispute 
about priority, the applicant who was first awarded the patent has the 
rights for 20 years (first-to-invent principle). In almost all other countries, 
the patent is given to the applicant who filed first (first-to-file principle). 
Usually it takes 2 to 5 years following the filing of the initial patent applica¬ 
tion before a patent is granted. In every jurisdiction, the holding of a patent 
can be of considerable economic value, and it is not a trivial matter to 
receive one. For this reason, both the patent application and the invention 
must meet a very strict set of criteria. 
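
The interplay between the 20-year term and the 2 to 5 years that examination usually takes can be made concrete with a little date arithmetic. The following Python sketch is only an illustration under simplifying assumptions: the function names and the dates are hypothetical, and term adjustments, extensions, and maintenance fees that may apply in a given jurisdiction are ignored.

```python
from datetime import date

PATENT_TERM_YEARS = 20  # term stated in the text: 20 years from the filing date


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def expiry_date(filing: date) -> date:
    """Date on which the exclusive rights lapse, counted from the filing date."""
    return add_years(filing, PATENT_TERM_YEARS)


def enforceable_years(filing: date, grant: date) -> float:
    """Approximate number of years of protection still remaining at grant."""
    return (expiry_date(filing) - grant).days / 365.25


# Hypothetical dates, chosen only for illustration.
filing = date(2005, 3, 1)
grant = date(2008, 3, 1)  # granted 3 years after filing

print(expiry_date(filing))                         # 2025-03-01
print(round(enforceable_years(filing, grant), 1))  # 17.0
```

In this simplified picture, the time spent in examination is counted against the 20-year term, so a patent that takes 3 years to be granted leaves roughly 17 years of enforceable protection.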

Generally, for either a product or a process to be patentable, it must 
satisfy four fundamental requirements: 

1. The invention, after having been shown to work ("reduced to 
practice") must be "novel," meaning that the invention does not 
exist as another patent that is held by someone else in another 
country; is not an existing product or process; and, outside the 
United States, has not appeared in some published form before the 
submission of the patent application. In the United States, an 
inventor has 1 year following publication in which to apply for a 
patent. 

2. A patent cannot be granted for something that was merely previ¬ 
ously unknown, i.e., a discovery; rather, the invention must con¬ 
tain, as judged by the patent office, an inventive step that was "not 
obvious" to other workers in the field. 
3. The invention must be "useful" in some way, whether it is a pro¬ 
cess, an instrument, a compound, a microorganism, or a multicel¬ 
lular organism. 

4. Every patent application must contain a description of the inven¬ 
tion that is sufficiently thorough that a person knowledgeable in 
the same field can implement it. 

A patent cannot be granted for anything that is "a product of nature." 
The notion here is that it is not appropriate for society to give a monopoly to 
someone for something that occurs naturally, has merely been discovered, 
and therefore belongs to the public. Often companies and individuals skirt 
this constraint by applying for a patent that covers the process of purifica¬ 
tion of a product, thereby avoiding the direct question of ownership of either 
a natural substance or an organism that produces the product. In some coun¬ 
tries, such as the United States, according to the Supreme Court in a land¬ 
mark decision, virtually "anything under the sun that is made by man" is 
patentable; however, in other countries, including members of the European 
Union, therapeutic and diagnostic procedures are not patentable. 

There is no simple, immediate system for the granting of a patent. The 
application must be prepared by an expert, normally a patent lawyer, and 
is organized in a defined pattern. In the United States, the application must 
have a title; an abstract describing succinctly the nature of the application; 
a section on the background of the invention that includes a full and open 
description of the current "state of the art" in the field of the invention; a 
comprehensive summary of the invention with, if considered helpful, fig¬ 
ures and schematic representations; sections that explain the nature of the 
invention and describe how the invention works; and, finally, a list of 
claims about the invention and how the invention may be used. The appli¬ 
cation is sent to the U.S. Patent and Trademark Office (PTO), where it is 
reviewed by an examiner for novelty, nonobviousness, utility, feasibility 
and general acceptability as a patentable invention. 

If an examiner agrees that the invention meets all the criteria for patentability, 
then a patent is awarded. However, the receipt of a patent is not 
a license to produce and sell the invention. All statutory regulations must 
be met before any product can be marketed. For example, if a patent is 
granted for a genetically engineered microorganism, the manufacturer 
must satisfy the recombinant DNA regulations for its production, distribu¬ 
tion, and release. Protection of patent rights is the responsibility of the 
patent holder, and generally that means bringing a lawsuit(s) against those 
who are presumed to be infringing on the patent. These disputes are 
decided by the courts and not the patent office. Similarly, if a person or 
company feels that an awarded patent is inappropriate, the legitimacy of 
the patent can be challenged by a lawsuit. 

If a patent application is rejected by an examiner, then the applicant can 
appeal the decision to a Patent Appeals Board. If this appeal is turned 
down, then the decision can be challenged legally. For some applicants, 
patenting can be a frustrating experience. There are a number of cases in 
which the stakes are considered to be so high that costly court battles go on 
for years. Thomas Edison, who held more than a thousand patents in his 
lifetime, once described a patent as "an invitation to a lawsuit." 

TABLE 22.3 Common types of patent categories with examples of biotechnology 
inventions 

Category                  Examples 

Product patents 
  Substance               Cloned genes, recombinant proteins, monoclonal 
                          antibodies, plasmids, promoters, vectors, cDNA 
                          sequences, antigens, peptides, RNA constructs, 
                          antisense oligonucleotides, peptide nucleic acids, 
                          ribozymes, and fusion proteins 
  Composition of matter   Multivalent vaccines, biofertilizers, bioinsecticides, 
                          host cells, microorganisms, transformed cell 
                          lines, and transgenic organisms 
  Devices                 Pulsed-field gel electrophoresis apparatus, 
                          DNA-sequencing units, and microprojectile gene 
                          transfer machine 

Process patents 
  Process of preparation  DNA isolation, synthesizing double-stranded DNA, 
                          vector-insert construction, PCR applications, and 
                          purification of recombinant proteins 
  Method of working       Nucleic acid hybridization assays, diagnostic 
                          procedures, and mutation detection systems using PCR 
  Use                     Applying biofertilizers and bioinsecticides, 
                          fermentation of genetically modified microorganisms, 
                          and nontherapeutic animal treatment systems 

Product patents and process patents make up the two major categories 
of patents. Products include homogeneous substances, complex mixtures, 
and various devices; processes include preparative procedures, 
methodologies, or actual uses (Table 22.3). The patenting of biotechnology 
innovations has been based on the historical experience of patenting inventions 
by the agricultural, fermentation, pharmaceutical, and medical industries. 
For example, in 1873, Louis Pasteur received two patents (U.S. patents 
135,245 and 141,072) for a process for fermenting beer that included the 
living organism (yeast) used in the process. Today, most but not all bio¬ 
technology patent applications are straightforward, and patents are 
granted without any significant problems. However, the first time a scien¬ 
tist attempted to patent a genetically modified microorganism that was 
engineered by the introduction of different plasmids, each of which car¬ 
ried the genes for a separate hydrocarbon degradative pathway, the case 
was highly controversial. This genetically modified bacterium, which was 
capable of breaking down many of the components of crude oil, was 
developed by A. Chakrabarty, who at the time worked for the General 
Electric Corporation. However, despite its potential usefulness in cleaning 
up oil spills, the patent application for the bacterium was rejected by the 
U.S. PTO on the grounds that microorganisms are products of nature and, 
as living things, are not patentable. In 1980, in a landmark decision, the 
U.S. Supreme Court decided that this organism was patentable according 
to the U.S. Patent Statute, arguing that "a live, human-made microor¬ 
ganism is patentable subject matter...as a manufacture or composition of 
matter." 

The argument against patenting this genetically engineered microor¬ 
ganism tended to center on how the organism was developed. In the past, 
induced mutation followed by selection for novel properties was an acceptable 
way to create a patentable living organism. However, genetic engineering 
was considered by some to be "tampering with nature." 
Consequently, it was argued that no inventor should benefit from manipu¬ 
lating "products of nature." This position was not upheld. Thus, in the 
United States from 1980 onward and later in other countries, organisms, 
regardless of the means that were used to develop them, must be judged by 
the standard criteria of novelty, nonobviousness, and utility to determine if 
they are patentable. 

Patenting in Different Countries 

The rights given by a patent extend only throughout the country in which 
the application was filed. Therefore, to protect an invention, a patent application 
must be filed separately in each country. Although the World 
Intellectual Property Organization is attempting to develop international 
standards, patent offices in different countries often reach quite different 
conclusions about the same patent application. For example, in 1989 the 
biotechnology company Genentech applied for a patent in the United 
Kingdom for, among other things, the production of human tissue plasminogen 
activator (tPA) by recombinant DNA processes. This protein exists in 
small amounts in the human body and converts plasminogen to plasmin. 
Plasmin is an active enzyme that degrades the fibrin of a blood clot. 
Consequently, human tPA has been considered as a possible therapeutic 
agent for the prevention and treatment of coronary thrombosis. Genentech, 
after considerable effort, assembled a complete version of a human tPA 
complementary DNA (cDNA) and cloned this cDNA into E. coli for the 
production of large amounts of pure tPA. In its patent application, 
Genentech claimed rights to human tPA as a product based on certain pro¬ 
cedures of recombinant DNA technology that they developed, the cloning 
vector system, and the transformed microorganism. As a part of the "pro¬ 
cess" category, protection for the use of human tPA as a pharmaceutical 
agent was also sought by Genentech. A total of 20 claims were presented in 
the original patent application. Some of these were broad and others were 
narrow in scope. The patent was rejected by the United Kingdom's patent 
office. Genentech then appealed to the United Kingdom's Court of Appeals, 
which, after considerable deliberation, invalidated all of the claims for a 
variety of reasons. The judgment concluded that the invention was novel, but 
some of the judges argued that the submission was obvious; therefore, it 
could not be patented. 

In contrast, Genentech was readily awarded a patent for human tPA in 
the United States. The U.S. patent not only protects the form of human tPA 
that was to be marketed by Genentech but also gives Genentech exclusive 
rights to all similar, but not identical, active forms of human tPA. Genentech 
won a lawsuit against two other biotechnology companies that were found 
to be infringing on its tPA patent, although they were selling nonidentical 
forms of tPA. 

The Japanese version of Genentech's tPA patent is limited to the amino 
acid sequence of the human tPA that was cloned and patented by 
Genentech. In Japan, other companies can sell variant forms of human tPA. 
Thus, basically the same patent application was rejected, approved and 
given a broad interpretation, and approved and given a narrow interpreta¬ 
tion by three different patent offices. For the present, at least, there are 
divergent views about what is or is not a patentable invention. 
Patenting DNA Sequences 

Currently, isolated nucleic acid sequences, whether DNA, RNA, or cDNA 
derived from RNA, and proteins are patentable. Although gene sequences 
and the mRNA and proteins encoded in these sequences are found natu¬ 
rally in organisms, purification from their natural state is considered suf¬ 
ficient to render them patentable. Since 1980, thousands of patent 
applications for whole genes have been approved by patent offices 
throughout the world. In the United States, more than 40,000 DNA-related 
patents have been issued, at a rate of 3,000 annually since 1998, and almost 
20% of human genes have been patented. Some of these are used to make 
therapeutic proteins such as recombinant erythropoietin. Erythropoietin 
stimulates the formation of red blood cells and is used to prevent anemia 
in patients with kidney failure who require dialysis. Many of the other 
patented gene sequences are used as diagnostic probes. One example 
results from the discovery that particular mutations in the human gene 
BRCA1 are linked to breast cancer. A patent, issued to Myriad Genetics Inc., 
claims methods to detect these mutations in BRCA1 to diagnose a predis¬ 
position to breast cancer. 

With the rapid accumulation of genetic sequences from genome 
sequencing projects and the undertaking of the partial sequencing of thou¬ 
sands of cDNA molecules from different organisms, tissues, and organs, 
the patenting of nucleic acid sequences became extremely contentious. In 
1991, the issue of patenting gene fragments was broached when scientists 
from the U.S. National Institutes of Health filed for the patent rights for 315 
partially sequenced human cDNAs (expressed sequence tags [ESTs]). Two 
additional filings brought the total number of partial sequences to 6,869. In 
1994, in a preliminary ruling, the U.S. PTO notified the National Institutes 
of Health that it would reject the patent application on the grounds that the 
functions of the sequences were not known. In other words, partial 
sequences by themselves did not fulfill the requirement of utility and were 
not patentable. However, by 1997, over 350 patent applications for more 
than 500,000 partial DNA sequences had been filed, mostly by private com¬ 
panies, which purportedly met the standard for usefulness. One of these 
patent proposals sought protection for about 18,500 ESTs. Consequently, 
serious concerns were raised about granting patents for large numbers of 
sequenced genes and partially sequenced DNA fragments with broadly 
based applications. 

Individuals who opposed the patenting of DNA fragments with 
unknown or loosely defined functions contended that genes and partial 
DNA sequences are discoveries or, more likely, products of nature and 
definitely not inventions. Others conceded that, although eventually some 
of these sequences might be useful, it was premature and speculative to 
award patents without additional information about the functions of the 
sequences. In this context, the thousands of ESTs are considered to be 
"means to ends" and not the actual end points. On the other side, those 
who favored patenting ESTs maintained that these collections were novel 
because they defined the normal messenger RNA (mRNA) complements of 
various tissues and organs and consequently had utility because each col¬ 
lection could be used as a diagnostic assay to determine the extent to which 
a disease alters the normal complement of mRNAs in various organs. 

After developing some ad hoc rules, the U.S. PTO examined in more 
detail a full range of issues and concluded that genes and partial DNA 
sequences were patentable. On January 5, 2001, a set of guidelines for gene 
patenting was released. The key requirement for this type of application 
was that each DNA sequence must have "specific and substantial credible 
utility." Moreover, the written specifications and claims for each sequence 
must be thorough and demonstrate the actual use of each sequence and not 
merely a potential function. These guidelines have established the criteria 
for patenting incomplete DNA sequences in the United States, although the 
granting of patents for any human DNA sequence remains controversial. 

Also controversial is the increased scope of the DNA patent claims in 
recent years. Often included in the claims to DNA sequences and the pro¬ 
teins they encode are antibodies against the protein, even when the antibody 
has not actually been produced. Antibodies against human proteins are 
important from a commercial perspective because they are used for diag¬ 
nostic purposes and as therapeutic agents, and therefore, there is consider¬ 
able incentive to include them in patent applications. However, some argue 
that they do not meet the criteria for patentability because in many cases the 
antibodies have not been produced and therefore their characteristics are not 
specifically described and working examples are not provided. 

Patenting Multicellular Organisms 

Transgenic Non-Human Mammals 

Inventors: P. Leder and T. A. Stewart 
Assignee: President and Fellows of Harvard College, Cambridge, MA 
U.S. patent 4,736,866, 12 April 1988 

In 1980, the U.S. Supreme Court defined a patentable invention as one that 
included "anything under the sun that is made by man." In 1988, a transgenic 
mouse was the first genetically engineered animal to be patented. In this 
case, the transgene consisted of a cancer-causing gene (oncogene) driven by 
a promoter in the long terminal repeat of the mouse mammary tumor virus 
(MMTV LTR). The oncogene was the myc gene from the chicken myelocytomatosis 
OK10 virus. The invention entailed cloning an MMTV LTR-myc fusion gene into 
a plasmid, injecting linearized plasmid DNA into the male pronuclei of 
fertilized one-celled mouse eggs, identifying offspring that expressed the 
myc gene, and establishing transgenic mouse lines. In some of these lines, 
the myc gene was expressed in several different tissues, and in other lines, 
it was limited to one or a few tissues. 

The integration of the MMTV LTR-myc gene construct, according to Leder and 
Stewart, "increases the probability of the development of neoplasms 
(particularly malignant tumors) in the animal." These transgenic organisms 
can be used to test whether a compound either causes or prevents cancer and 
as a source of cell lines from cells of various tissues, such as the heart, 
that are difficult to culture. Since 1989, Du Pont has been selling one of 
these lines of transgenic mice under the trade name OncoMouse. More 
generically, others prefer to call this mouse line the "Harvard oncomouse" 
or, for short, just "oncomouse." 

The granting of U.S. patent 4,736,866 was contentious, with much of the 
concern directed at the ethical implications of such patents. Those who 
oppose the patenting of transgenic animals argue that this type of patent 
violates the sanctity of life, threatens the integrity of species, and 
fosters inhumane treatment of animals. Despite these allegations, since 
1988, hundreds of patents have been granted in the United States for various 
transgenic organisms. For example, there are now patents for transgenic 
animals that act as models for benign prostatic disease, inflammatory 
disease, altered fat tissue metabolism, and thrombocytopenia, to name a few. 
To date, neither the U.S. courts nor the U.S. government has suggested that, 
in principle, any of these patents is inappropriate. The patenting of 
transgenic organisms is no longer an issue in the United States, and after 
much discussion and litigation, transgenic animals are now patentable in 
most jurisdictions throughout the world. 

The patenting of multicellular organisms continues to raise ethical and 
social concerns. However, there is nothing intrinsically new about the 
exclusive ownership of living material. In the past, microorganisms were 
routinely patented, and specific laws were promulgated to give plant 
breeders the right to own various plant varieties. The transgenic mouse 
("OncoMouse") that carries an activatable gene that makes it susceptible to 
tumor formation has been the precedent-setting case in many jurisdictions 
to determine whether genetically modified animals are patentable. 
Currently, patenting of genetically modified animals is sanctioned in most 
developed countries, including the United States, members of the European 
Union, Japan, Australia, and New Zealand. However, in other countries 
such as Canada, it is still not possible to patent a multicellular organism 
such as the OncoMouse. 

Vigorous challenges to patenting transgenic animals have been put 
forward on moral grounds. In other words, the issue is whether society 
considers this form of patenting acceptable. From a historical perspective, 
it is unlikely that a position based on ethical considerations will be com¬ 
pletely successful in preventing the patenting of all transgenic animals. For 
example, if an invention purports to facilitate a new treatment for human 
disease, the currently prevalent view in most countries is that human rights 
and needs supersede those of animals. However, patenting is not an abso¬ 
lute right, and governments, by passing specific laws, can determine what 
can or cannot be patented. If an invention is considered by various special 
interest groups to have a potentially negative economic impact on an 
existing agricultural practice, for example, then it is quite possible that a 
law preventing the implementation of the new technology could be 
passed. 

Patenting and Fundamental Research 

Not everyone believes that patenting is worthwhile. Some opponents argue 
that awarding a monopoly restricts competition, leads to higher prices, 
curtails new inventions, and favors large corporations at the expense of 
individual inventors or small companies. Despite these concerns, the 
patent system is well established and is here to stay. Moreover, patent own¬ 
ership does not appear to prevent significant research and development by 
other researchers and companies. Indeed, it might be argued that if patents 
were serious impediments to innovation, then U.S. patent 4,237,224, which 
was granted to Stanley Cohen and Herbert Boyer in 1980 for recombinant 
DNA technology for both the use of viral and plasmid vectors and the 
cloning of foreign genes, should have seriously constrained the develop¬ 
ment of recombinant DNA technology (Fig. 22.2). Obviously, no such hin¬ 
drance has occurred. 

A method for replicating a biologically functional DNA, which comprises: 
transforming under transforming conditions compatible unicellular organisms 
with biologically functional DNA to form transformants; said biologically 
functional DNA prepared in vitro by the method of: (a) cleaving a viral or circular 
plasmid DNA compatible with said unicellular organism to provide a first linear 
segment having an intact replicon and termini of a predetermined character; (b) 
combining said first linear segment with a second linear DNA segment, having at 
least one intact gene and foreign to said unicellular organism and having termini 
ligatable to said termini of said first linear segment, wherein at least one of said 
first and second linear DNA segments has a gene for a phenotypical trait, under 
joining conditions where the termini of said first and second segments join to 
provide a functional DNA capable of replication and transcription in said 
unicellular organism; growing said unicellular organisms under appropriate 
nutrient conditions; and isolating said transformants from parent unicellular 
organisms by means of said phenotypical trait imparted by said biologically 
functional DNA. 

FIGURE 22.2 The first claim of U.S. patent 4,237,224, granted to S. Cohen and H. 
Boyer on 2 December 1980 and entitled "Process for producing biologically 
functional molecular chimeras." 

In the past, patenting and patent enforcement were rarely of interest to 
academic researchers working in the biological sciences. Now, however, 
there is a view within the academic scientific community that patents and 
the consequences of patenting may be detrimental to established scientific 
values. Traditionally, science, especially university-based research, has 
been an open system with a free exchange of ideas and materials through 
publications and personal communications. The ideas of others have been 
respected, and contribution to the technical development of an area of 
study has, in many instances, been a shared enterprise. However, more 
recently, some scientists have begun to feel that the integrity of traditional 
scientific inquiry has become secondary to self-interest, in that public 
recognition and financial gain from innovations are the prime motivations for 
conducting scientific research. It is argued that research is often carried out 
secretly and has created elite, noncooperating research groups. In the past, 
there was a tendency to avoid secrecy in basic research. The belief was that 
scientific knowledge would grow if research results were published as 
articles in journals that could be read by anyone, thereby enabling 
researchers to direct their studies in appropriate directions and to benefit 
from the discoveries of others. With secrecy, time and effort may be wasted 
on repeating experiments that, unbeknownst to the researcher, have 
already been done. Now, scientists are advised by patent lawyers to keep 
their work secret until a patent is filed. Consequently, the lure of patenting 
has made a large number of scientists reluctant to talk about their work, at 
least until after the patent application has been filed. 

Furthermore, because of chronic financial constraints, nonprofit insti¬ 
tutions, and especially universities, have sought additional forms of rev¬ 
enue. Licensing fees and royalties from patents can be sources of new 
income. One example is the Cohen-Boyer patent for recombinant DNA, 
which, during its lifetime from 1980 to 1997, earned about $45 million for 
Stanford University and the University of California. Also, the Massachusetts 
Institute of Technology files more than 100 patents annually in all research 
fields and generates about $5.5 million per year from licensing patent 
rights. Most universities have established patent policies and offices that 
facilitate both patenting and the transfer of technology, at a price, to 
industry. Faculty members usually receive a portion of the income from 
their inventions. Clearly, entrepreneurial activity is a fact of life at many 
universities. The challenge is to prevent this legitimate function from 
dominating all aspects of academia. 

In sum, the enthusiasm for patenting and patent protection has elic¬ 
ited the perception that traditional science may become hostage to patent 
holders and that research will become less fruitful. Others feel that the 
traditional way of doing science is an outmoded, inefficient, and indulgent 
exercise and that patent ownership and the drive for ownership will spur 
new discoveries. This controversy will not be readily resolved. It is clear 
that the emergence of molecular biotechnology has raised far-reaching 
considerations, even including how scientific inquiries ought to be con¬ 
ducted. 
SUMMARY 


Significant technological advances, such as molecular biotechnology, 
are seldom implemented without controversy. 
The issues and concerns raised by the ability of scientists to 
genetically engineer organisms have had far-reaching implica¬ 
tions and have resulted in the establishment of official guide¬ 
lines to ensure that the introduction of recombinant DNA 
products into the marketplace does not adversely affect 
human or animal health or the environment. In this chapter, 
various aspects of the regulation of recombinant DNA tech¬ 
nology, food products from genetically modified organisms, 
and the release of genetically modified organisms into the 
environment are discussed. 

Guidelines for the proper conduct of recombinant DNA 
technology experiments were established by the NIH in the 
late 1970s and were fine-tuned in the early 1980s, but two 
unresolved issues remained. First, how was the commercial¬ 
ization of genetically engineered products to be regulated? 
Second, how was the deliberate release of genetically modi¬ 
fied organisms into the environment to be managed? Industry, 
believing that no special regulations should be implemented 
for genetically engineered products, has taken the view that 
the nature of the product and its properties, not the process 
that was used to manufacture the product, are what matter. 
This view was adopted in the United States for pharmaceu¬ 
tical products. On the other hand, there has been more con¬ 
cern about genetically engineered products that are consumed 
by humans and animals, including the food ingredients pro¬ 
duced by genetically engineered organisms and transgenic 
crops and livestock. Generally, the FDA, which is responsible 
for ensuring the safety of pharmaceuticals and food products, 
has taken a case-by-case approach to the problem of accepting 
genetically engineered products as safe to eat and produce. 
Depending on the product, a specific set of criteria must be 
met before it can be released for human consumption. 

Guidelines have also been developed for the release of 
genetically modified organisms into the environment. For 
transgenic plants grown in open fields, the potential for unintended 
impacts on other organisms in the cultivation area, 
such as insects, or in adjacent fields, such as weed plants, must 
be evaluated. This is also the case for genetically engineered 
animals that may escape from contained facilities, such as 
pastures or aquaculture pens. 

Companies that produce biotechnology products often 
invest a great deal of time and resources to develop the prod¬ 
ucts to the commercialization stage. Patents are a means to 
protect their investment by giving the patent holder exclusive 
rights to make, use, or sell the product for a specific period of 
time. These rights are a reward for developing a procedure, 
compound, or apparatus and are intended to spur innovation. 
The public also benefits from the disclosure of the details of an 
invention, knowledge that will both prevent loss of time and 
energy in the pursuit of something that has already been 
invented and stimulate further research. For a patent to be 
granted, an invention must be novel, not obvious, and useful. 
In addition, the invention should not be a "product of 
nature." 

The key case that established that genetically engineered 
microorganisms were patentable was brought forward by A. 
Chakrabarty. In 1980, the U.S. Supreme Court ruled that a 
bacterium that had been created by a form of genetic manipu¬ 
lation could be patented. As a result of this landmark decision, 
U.S. patents have been granted for genetically modified plants 
and animals. Moreover, after considerable debate, transgenic 
animals and plants are also patentable in most countries in the 
world. 

With the development of molecular biotechnology, ques¬ 
tions about whether private industries should be allowed to 
own or patent genetically engineered organisms have been 
raised. On one hand, it has been argued that without such 
proprietary rights, biotechnology companies would not have 
the incentive to develop and market novel products. On the 
other hand, some critics find this type of privilege to be mor¬ 
ally unacceptable, and they contend that patents of this sort 
inhibit research and constrain innovation. For a variety of 
reasons, patenting has also had an impact on how university-based
biotechnology research is conducted.

REVIEW QUESTIONS


1. What is the RAC? What was its role in regulating recombi¬ 
nant DNA research? 

2. What criteria are used by the FDA to determine if a recom¬ 
binant protein is acceptable as a food or food additive? 

3. Why are genetically engineered microorganisms that are 
designed to be released into the environment regulated? 


4. Present an argument for or against the ban on production 
of L-tryptophan in a genetically engineered bacterium for 
human consumption. 

5. Discuss the positive and negative aspects of licensing 
recombinant bovine somatotropin.

6. What are some of the criteria that regulatory agencies consider
in approving transgenic crops?

7. Why is it important to evaluate transgenic livestock on a 
case-by-case basis? 

8. What are the essential requirements for patenting an inven¬ 
tion? 

9. What are process patents? Product patents? Give exam¬ 
ples. 

10. What information must be present in a patent applica¬ 
tion? 


11. Why is a patent helpful to researchers who do not hold the 
patent? 

12. What is the purpose of a patent? 

13. Prepare arguments for both sides of the following debate: 
"Resolved: patenting of genetically engineered multicellular 
organisms should be banned." 

14. Go to the U.S. PTO website (http://patents.uspto.gov/) 
and conduct a search for biotechnology patents. Use various 
combinations of search words, such as transgenic AND mouse 
or DNA AND diagnostic. Summarize the inventions in the 
five most recent patents. 



Concerns about the Safety of Consuming Genetically Modified Foods
Alteration of the Nutritional Content of Food
Potential for Introducing Toxins or Allergens into Food
Potential for Transferring Transgenes from Food to Humans or Intestinal Microorganisms
Controversy about the Labeling of Genetically Modified Foods
Concerns about the Impact of Genetically Modified Organisms on the Environment
Impact on Biodiversity
Impact of the Bt Toxin on Nontarget Insects
Environmental Benefits of Genetically Modified Organisms
Economic Issues
Who Benefits from Molecular Biotechnology?
How Do Views about Genetically Engineered Food Impact Trade?
SUMMARY
REFERENCES
REVIEW QUESTIONS


Societal Issues in 
Biotechnology 


No technology is without risks. However, it is important to weigh
the benefits of a technological development against the risks and 
to manage the risks in a responsible and informed manner. Over 
the last 30 years, recombinant DNA technology has provided many com¬ 
mercial products, described throughout this book, that benefit society, and 
after much scientific scrutiny, most have proven to be safe. For the most 
part, society has accepted the technology. Vaccines and other medicines 
developed using recombinant DNA technology are generally accepted by 
the public and considered to be necessary and at least as safe as nonrecom¬ 
binant medicines. On the other hand, food products derived from trans¬ 
genic plants and animals make some people uneasy. They question whether 
the introduction of a transgene may make the food toxic, allergenic, or less 
nutritious or whether the transgenes can be transferred to other gut organ¬ 
isms, or perhaps even to the consumer, during digestion. Furthermore, 
some worry that once a transgene is released from the laboratory environ¬ 
ment, in a cultivated plant or a farmed animal, it will have unintended, 
harmful consequences that cannot be foreseen or controlled. Some oppo¬ 
nents of molecular biotechnology will never be convinced that the products 
are safe. For them, the transfer of genes among unrelated organisms is 
unnatural and therefore inherently wrong. Others want assurance that the 
risks are minimal and are justified because the products are necessary and 
that society in general, rather than a few select interest groups, will benefit 
from their availability. 


Concerns about the Safety of Consuming Genetically 
Modified Foods 

All foods produced today are derived from plants and animals that have 
been genetically manipulated to enhance desirable characteristics through 
either selective breeding or recombinant DNA technology. The genetic 
changes introduced through recombinant DNA technology are vastly 
smaller and better understood than those from traditional breeding, as 
only a few well-characterized genes are introduced. Also, because known 
regulatory elements are included on the introduced genetic constructs, the
spatial and temporal expression of recombinant genes can be controlled. 
Despite these advantages, some consumers are concerned that manipu¬ 
lating the genomes of food plants and animals through recombinant DNA 
technology can lead to the production of foods that are unsafe for human 
consumption. This fear has had a powerful effect on the food industry. For 
example, consumer concern over the safety of transgenic foods led major 
baby food producers to stop using them in their products, and several fast 
food chains have removed them from their menus. On the other hand, 
proponents of genetic engineering argue that by rejecting transgenic ingre¬ 
dients producers are marketing foods that are less safe. Conventional food 
is more likely to be contaminated with potent mycotoxins produced by 
molds, which could be prevented by growing crops that are engineered to 
be mold resistant. Allergens, naturally present in many foods, could be 
removed through genetic manipulation. 

Alteration of the Nutritional Content of Food 

All genetically engineered foods derived from plants and animals are
tested on a case-by-case basis for "substantial equivalence" before approval
for commercialization by regulatory agencies. This means that the safety
and nutritional content of genetically engineered food must be substantially
the same as those of the corresponding conventional food. The nutrients
that are measured include carbohydrates, proteins, fats, amino acids,
fatty acids, vitamins, and minerals (Table 23.1). The natural variation in
nutritional content among the conventional varieties of the food product
due to varietal, developmental, or environmental factors is taken into
consideration, and therefore, there is usually a range of levels that are
considered acceptable for a given food. Also considered are any changes in
antinutritional factors that occur naturally in foods. These include
chelators, such as phytic acid, which effectively remove calcium, iron, zinc,
and magnesium; protease inhibitors, which prevent protein digestion; and
lectins, which reduce the bioavailability of carbohydrates. The measurements
are used to determine whether undesirable changes in nutrient content
have occurred as a consequence of genetic manipulation, that is, whether
levels of important nutrients are reduced or levels of potentially harmful
compounds have increased.

TABLE 23.1 Comparison of nutritional content of glyphosate-tolerant transgenic corn grain
(Roundup Ready corn line GA21) with that of the nontransgenic parental control line

Nutrient             Transgenic   Nontransgenic   Range (avg) for
                     corn         corn            conventional corn

Proximates (% dry weight)
  Protein            11.05        10.54           6.67-14.69 (10.18)
  Fat                3.90         3.98            2.03-1.90 (3.48)
  Carbohydrates      83.7         83.8            77.4-89.5 (84.8)

Fatty acids (% total fatty acids)
  Palmitic (16:0)    10.70        10.72           8.57-17.46 (11.87)
  Stearic (18:0)     1.68         1.67            1.02-2.86 (1.94)
  Oleic (18:1)       24.2         24.1            17.4-38.5 (25.6)
  Linoleic (18:2)    61.4         61.5            47.7-64.2 (57.5)

Minerals (% dry weight)
  Phosphorus         0.326        0.326           0.160-0.533 (0.321)
  Calcium            0.0039       0.0043          0.0022-0.0163 (0.0043)

Amino acids (% total amino acids)
  Methionine         2.16         2.17            (2.05)
  Cysteine           2.22         2.28            (2.12)
  Lysine             3.11         3.02            (3.09)
  Leucine            12.98        12.87           (12.91)
  Tryptophan         0.61         0.61            (0.62)
  Phenylalanine      5.31         5.33            (5.10)

Adapted from Chassy et al., Comp. Rev. Food Sci. Food Saf. 3:35-104, 2004.

Only some of the nutrients measured are listed. Also shown are the ranges of nutrient values found in conventional
corn grains in the United States. The values in the rightmost column were provided by the International Life Sciences
Institute Crop Composition database (http://www.cropcomposition.org). Note: the ranges for individual amino acids
could not be calculated from available data.
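The comparison described above can be illustrated with a short Python sketch. It is only a rough screening aid built on assumptions, not the procedure that regulatory agencies actually follow: the record type, the function names, and the choice to report a percent difference from the parental control are all hypothetical. The numbers are taken from Table 23.1 for those nutrients that have a published conventional range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NutrientRecord:
    """One row of Table 23.1 (hypothetical container, for illustration only)."""
    name: str
    transgenic: float   # value measured in the transgenic line
    control: float      # value measured in the nontransgenic parental line
    conv_low: float     # low end of the range for conventional corn
    conv_high: float    # high end of the range for conventional corn


# Values copied from Table 23.1; only rows with a usable range are included.
TABLE_23_1 = [
    NutrientRecord("Protein", 11.05, 10.54, 6.67, 14.69),
    NutrientRecord("Carbohydrates", 83.7, 83.8, 77.4, 89.5),
    NutrientRecord("Palmitic (16:0)", 10.70, 10.72, 8.57, 17.46),
    NutrientRecord("Stearic (18:0)", 1.68, 1.67, 1.02, 2.86),
    NutrientRecord("Oleic (18:1)", 24.2, 24.1, 17.4, 38.5),
    NutrientRecord("Linoleic (18:2)", 61.4, 61.5, 47.7, 64.2),
    NutrientRecord("Phosphorus", 0.326, 0.326, 0.160, 0.533),
    NutrientRecord("Calcium", 0.0039, 0.0043, 0.0022, 0.0163),
]


def within_conventional_range(rec: NutrientRecord) -> bool:
    """True if the transgenic value lies inside the conventional range."""
    return rec.conv_low <= rec.transgenic <= rec.conv_high


def percent_difference_from_control(rec: NutrientRecord) -> float:
    """Signed difference between transgenic and control, as a percentage of control."""
    return 100.0 * (rec.transgenic - rec.control) / rec.control


def screen(records):
    """Return (name, in_range, percent_difference) for each nutrient."""
    return [
        (r.name, within_conventional_range(r),
         round(percent_difference_from_control(r), 1))
        for r in records
    ]


if __name__ == "__main__":
    for name, in_range, diff in screen(TABLE_23_1):
        status = "within range" if in_range else "OUTSIDE range"
        print(f"{name:<18} {status:<14} {diff:+.1f}% vs. control")
```

A value that falls outside the conventional range would not by itself show that a food is unsafe; in this sketch it would simply flag a nutrient for closer examination, which mirrors the case-by-case assessment described in the text.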

Numerous feeding trials in which insect-resistant or herbicide-tolerant 
transgenic corn (maize), rice, potatoes, soybeans, or tomatoes were fed to 
laboratory animals or livestock for prolonged periods, and often for several 
generations, have found no adverse effects related to nutrient deficiencies. 
There were no significant differences in the composition, quality, or digest¬ 
ibility of the food or in the development, health, or performance of animals 
fed genetically engineered and conventional plants. For example, more 
than 20 studies have shown that the yield and quality (fat, lactose, and 
protein content) of milk produced by lactating dairy cows fed corn geneti¬ 
cally engineered to be insect resistant or corn, soybeans, or beets genetically 
modified to be tolerant of the herbicide glyphosate are the same as those of 
milk from cows fed a nontransgenic diet. 

Detailed analyses of the compositions of products derived from cloned
animals have led the U.S. Food and Drug Administration (FDA) Center for
Veterinary Medicine to conclude that there is no difference between the 
nutritional contents of meat or milk from clones and conventionally bred 
animals. These studies measured milk yields and milk and meat fat, pro¬ 
tein, and carbohydrate contents. The amounts of amino acids, fatty acids, 
and important vitamins and minerals were also determined. It is acknowl¬ 
edged that an exhaustive analysis of tissue composition is impossible due 
to the complexity of the molecules present and that composition is impacted 
appreciably by the diet of the animal and its environment. 

Several plants have been engineered to improve the nutritional value of 
food (Table 23.2). In these cases, the nutritional content of the genetically 
engineered food is intentionally made different from that of the convention¬ 
ally bred plant and in this regard does not meet the criteria for substantial 
equivalence. An important example of a nutritionally enhanced food is 
"golden rice," which was created to address dietary deficiencies of vitamin 
A. Insufficient consumption of vitamin A is a significant problem in devel¬ 
oping nations, where each year hundreds of thousands of children are 
blinded due to retinal and corneal damage, suffer from infectious diseases, 
or die as a consequence of this nutritional deficiency. Carrots, tomatoes, 
meat, and milk are good dietary sources of vitamin A; however, in devel¬ 
oping countries, they are not always readily available. Rather, rice grains, 
which do not contain vitamin A, are a primary food source. In 2000, after 7
years of research, scientists successfully engineered rice to produce
β-carotene, a biosynthetic precursor of vitamin A, by introducing genes from
daffodil and the bacterium Erwinia uredovora. In the initial strain, the
levels of β-carotene were too low to provide the recommended amounts through
diet, a problem highlighted by opponents of the genetically engineered rice.
However, a subsequent version, announced in 2004, produced 20 times
more β-carotene, thereby reducing the amount of rice that would need to be
eaten in order to obtain sufficient vitamin A to stave off malnutrition.

TABLE 23.2 Some genetically modified plants with enhanced nutritional value

Plant     Enhanced trait                            Transgene product

Canola    Increased vitamin E                       γ-Tocopherol methyl transferase
Canola    Increased γ-linolenic acid                Δ6 and Δ12 desaturases
          (omega-6 fatty acid)
Canola    Added β-carotene (vitamin A precursor)    Phytoene synthase, phytoene desaturase,
                                                    lycopene cyclase
Cassava   Decreased cyanogenic toxins               Hydroxynitrile lyase
Coffee    Decreased caffeine                        Antisense xanthosine-N-7-methyltransferase
Corn      Increased vitamin C                       Dehydroascorbate reductase
Corn      Increased iron                            Ferritin, phytase
Potato    Decreased solanine                        Antisense sterol glycotransferase
          (glycoalkaloid toxin)
Rice      Added β-carotene                          Phytoene synthase, phytoene desaturase,
                                                    lycopene cyclase
Rice      Increased iron                            Ferritin, metallothionein, phytase
Tomato    Increased β-carotene and lycopene         Lycopene cyclase, phytoene desaturase
Tomato    Increased flavonoids (antioxidants)       Chalcone isomerase

Adapted from EFSA GMO Panel Working Group on Animal Feeding Traits, Food Chem. Toxicol. 46:S2-S70, 2008.

Despite the promise of this nutritionally enhanced food to solve 
vitamin A malnutrition, almost a decade after it was first developed, 
golden rice is not yet available commercially. Among other extensive
environmental and biosafety analyses, golden rice, like other nutritionally
enhanced foods, must be assessed for substantial equivalence in nutritional
composition to nontransgenic rice, except for the compound that is
intentionally altered (i.e., β-carotene). Where all other compositional qualities
are similar to those of a nonengineered comparator plant, the safety of the 
altered nutrient levels is evaluated against well-established nutritional 
guidelines. The large amounts of golden rice required for a complete safety 
assessment have been difficult to obtain in the greenhouse and in the lim¬ 
ited field trials approved so far. Lack of financial support for a product that 
is being developed for low-income consumers and aggressive opposition 
to nutritionally enhanced food developed through genetic engineering 
have delayed completion of the requirements for regulatory approval. 
Opponents argue that rice containing the vitamin A precursor is unneces¬ 
sarily risky because other methods to deliver vitamin A are available, 
including vitamin tablets and vitamin-fortified foods, such as sugar, and 
growing vegetables rich in vitamin A. However, these methods are often 
costly and difficult to sustain. Despite the obstacles, holders of the various 
patented technologies that contributed to the development of golden rice 
have provided licenses to developing countries at no charge in anticipation 
of its eventual commercialization (expected in 2011) as a humanitarian
gesture to improve the health of billions of people.

Potential for Introducing Toxins or Allergens into Food

It is important to bear in mind that no foods can be guaranteed to be 100% 
safe. Many foods contain trace amounts of natural toxins that are harmless 
when consumed at low levels. For example, two common food plants, the 
potato and tomato, are members of the nightshade (Solanaceae) family that 
produce glycoalkaloids that can cause serious illness when consumed in 
relatively large quantities. Also, many foods naturally contain proteins that 
elicit an allergic response in some consumers. Extensive health safety tests 
of animals that compare genetically engineered foods to their nonengineered
counterparts have concluded that the process of creating genetically
engineered foods does not make the food different in digestibility or detri¬ 
mental to the health of the consuming animal. However, it is necessary to 
ensure that the introduction of a specific transgene into a food product does 
not increase the risk of producing a toxin or allergen (Table 23.3). The regu¬ 
latory agencies consider each food product derived from a transgenic plant 
or animal on a case-by-case basis and assess the risks against those associ¬ 
ated with consuming the corresponding nontransgenic product. 

There have been some highly publicized reports of adverse effects that 
have propagated unease among some consumers about eating genetically 
engineered foods. Often, the reports in the popular press have focused on 
particular aspects of the story that have led to confusion or have failed to 
mention that the data do not hold up to scientific scrutiny. For example, in 
August 1998, Arpad Pusztai, a scientist at the Rowett Research Institute in 
Aberdeen, Scotland, announced on a British television program that rats 
fed transgenic potatoes for 110 days were stunted and had suppressed 
immune function. He did not mention that the transgene encoded a plant 
lectin, a carbohydrate-binding protein. He also failed to emphasize that the 
experiments were preliminary and were being conducted to determine if 
this particular genetically modified plant was actually safe. Pusztai's rev¬ 
elation became instant news with the focus of attention on transgenic 
plants in general and not the specific gene that had been introduced into 
potatoes in this instance. The incident was further complicated when 
Pusztai's data were found by the director of his institute to be deficient 
scientifically. Eventually, after much controversy, Pusztai and an associate 
published a study suggesting that rats fed a diet consisting solely of trans¬ 
genic potatoes engineered to express a plant lectin gene had compromised 
immune systems. Independent analysis of this report by a committee of the 
Royal Society in Britain and the National Institute for Quality Control of 
Agricultural Products in the Netherlands found serious scientific short¬ 
comings that brought into question the validity of the results. Despite these 
expert assessments, this transgenic-potato saga was used by the critics of 
biotechnology as proof that transgenic plants are inherently dangerous to 
humans. 

A transgene that is commonly introduced into crops encodes the crystal 
(Cry) protein, an insecticidal toxin, from the bacterium Bacillus thuringiensis 
(Bt toxin). When ingested by target insect larvae, the toxin interacts with 
specific receptors in the epithelium of the larval gut. The protein inserts 
into the epithelium, forming a pore through which gut contents leak, 
leading to the death of the insect. The Bt toxin has been used safely for over 
40 years to protect crops from a variety of insect pests, either as an
agricultural spray or produced by transgenic plants. The Cry protein is
produced by B. thuringiensis in an inactive form as a full-length protein.
Proteases produced specifically by the insect cleave the protein, which
activates the toxin. The proteases are active in the slightly alkaline
environment of the insect gut. The protein is safe for human consumption
because humans do not produce the specific protease required to activate the
toxin or the epithelial receptors that bind the toxin, and the toxin is
rapidly degraded under the acidic conditions found in the mammalian digestive
system. Nonetheless, some consumers have questioned whether the Cry protein
may be a potential allergen. This concern is fueled in part by reports that
are often based on misinterpreted scientific studies. For example, in a 2005
campaign against Bt rice, Greenpeace stated that rice genetically engineered
to produce the Cry1Ac protein "could cause an allergic reaction, as
it did when tested on mice." Several scientific studies were cited to support
the claim that the protein elicited an allergic reaction in mice. However, it
was not explained that the purpose of the cited studies was to test the
Cry1Ac protein as an adjuvant to increase the efficacy of a vaccine or that
the Cry1Ac protein was chosen because it was known to have low toxicity
in vertebrates. While anti-Cry1Ac antibodies typical of an immune response
were produced in the immunized mice, they were not immunoglobulin E
antibodies involved in the allergic response, nor was evidence of an allergic
reaction presented. It was also not mentioned that Cry1Ac was already
widely produced in crops consumed by humans and other animals without
harmful effects. Interestingly, there has been almost no objection to the
spraying of the Bt toxin over the last 4 decades; opposition seems only to
be voiced against its production in transgenic plants.

TABLE 23.3 Some examples of animal feeding studies conducted to assess the potential toxicity of genetically
modified (GM) foods

GM food   Transgene   Test animal   Duration of feeding       Health effects

Corn      epsps       Rat           13 weeks                  No adverse effects; animals were similar in overall
                                                              health, body weight, food consumption, organ
                                                              weights, blood and urine chemistry, and tissue
                                                              appearance
Corn      cry1Ab      Chicken       38 days                   No significant difference in survival and body
                                                              weight
Corn      cry1Ab      Pig           91 days                   No difference in nutrient digestibility and energy
                                                              content
Soybean   epsps       Rat           13 weeks                  No difference in animal activity, body weight,
                                                              blood chemistry, and urine chemistry; no gross
                                                              abnormalities
Soybean   epsps       Mouse         Gestation and lactation   No difference in litter size, body weight of pups,
                                                              and testicular development
Potato    bar         Rat           10 weeks                  No significant difference in body or organ weights,
                                                              food consumption, sperm motility, litter size,
                                                              survival and body weight of pups; significant
                                                              reduction in male and female fertility

Adapted from Domingo, Crit. Rev. Food Sci. Nutr. 47:721-733, 2007.

The health effects are compared to those in animals fed non-genetically modified food. epsps confers tolerance for
glyphosate herbicide, cry1Ab confers resistance to lepidopteran insects, and bar confers tolerance for glufosinate
herbicide.

To further support the allergenicity of Cry1Ac, opponents made reference
to concerns among regulators regarding the potential allergenicity of
another Bt product known as StarLink corn, an unfortunate association due
to a regulatory breach by StarLink. The B. thuringiensis gene used to
generate StarLink corn encodes the Cry9C protein, which differs from Cry1Ac
and other Bt toxins. Although Cry9C is not similar to known protein
allergens and is not derived from an organism known to produce allergenic
proteins, it has greater heat stability and is not as readily digested, which
are characteristics of some allergenic proteins. For these reasons, the FDA
did not approve StarLink for human use but did allow it to be sold as
animal feed and used for industrial purposes. However, in July 2000, Larry
Bohlen, director of the Community, Health, and Environment Program for
the U.S. affiliate of the international environmental organization Friends of
the Earth, went grocery shopping and bought a large number of products
that contained corn flour, including taco shells, corn flakes, and muffin
mixes. He had these tested for the StarLink Cry9C gene using the polymerase
chain reaction (PCR) test, and evidence of the gene was found in
some of the taco shells. Clearly, food safety regulations had been broken. It
was estimated that StarLink contaminated, at most, about 0.125% of all
corn-based foods. The source of the contamination has never been traced.
Aventis, the company that developed StarLink, has assumed that some
farmers, probably inadvertently, mixed it with other varieties of corn
despite written agreements that committed them to keep their harvested Bt
crops segregated. The seed is no longer sold.

The StarLink episode had a number of ramifications. The critics of bio¬ 
technology pointed out that the regulatory system was seriously flawed 
and stressed that consumers should be aware of the hazards of genetically 
modified organisms. The biotechnology industry representatives noted 
there was no public health concern but that the regulatory issue had to be 
corrected. The FDA declared that it would no longer allow a product to be 
approved only for animal feed. A number of scientific advisory panels were 
convened to study in detail the consequences of Cry9C in human food, the 
possibility of consumer maladies due to eating products that may have 
been derived from StarLink, and how to improve the regulation and sur¬ 
veillance of transgenic crops. By 2001, there were no definitive cases of 
allergic reactions to Cry9C, but individuals who had symptoms compatible 
with a possible newly acquired allergy were monitored. No serious illness 
has been documented as a result of eating food contaminated with 
StarLink. 

The potential for introduction of allergenic proteins into a food product 
that does not otherwise elicit an allergic response is a major concern. The 
FDA requires labeling to indicate genetically engineered foods containing 
proteins from organisms known to produce proteins that are allergens to 
humans. Major allergens are proteins from eggs, milk, shellfish, fish, tree 
nuts, soybeans, wheat, and peanuts. In most cases, such transgenic food 
products are not commercialized. For example, soybeans engineered to 
express a protein from Brazil nuts that was intended to increase the methi¬ 
onine content of the soybeans were found to react with sera from individ¬ 
uals known to be allergic to Brazil nuts. As a consequence, further 
development of the transgenic soybeans was voluntarily terminated by the 
developer. 

Food allergies are a significant problem in developed countries. In the 
United States, about 5 to 8% of children and 1 to 2% of adults have food 
allergies, some of which can be fatal. Molecular biotechnology has the 
potential to reduce the allergenicity of some foods by preventing the syn¬ 
thesis of the allergenic protein (Table 23.4). A hypoallergenic variety of 
peanut is currently under development using RNA interference to reduce
levels of the seed storage protein Ara h 2, the most potent peanut allergen
to which most hypersensitive individuals respond, often fatally. 

Potential for Transferring Transgenes from Food to Humans or 
Intestinal Microorganisms 

We consume a large amount of DNA, and the encoded protein products, in 
the plant and animal tissues that make up our daily diet. Indeed, these 
supply part of the nutrition from food. It has been estimated that an 
average adult consumes approximately 0.1 to 1 g of DNA per day. The 
DNA is partially degraded during food processing at high temperature or 
low pH and is further degraded into small fragments through chewing and 
the activities of nucleases in saliva and the gastrointestinal tract. Only a 
very small amount of the total DNA consumed remains in fragments that 
are capable of carrying an intact gene as the digested DNA passes into the 
small intestine. This was shown in patients who had an ileostomy, an
operation in which the end of the small intestine, the ileum, is diverted
to an opening in the abdominal wall so that it bypasses the colon, which
allowed researchers to collect digestion products after they had passed
through the small intestine but before they reached the colon. After eating
soybeans genetically engineered with the epsps gene (encoding
5-enolpyruvylshikimate-3-phosphate synthase, which confers tolerance of the
herbicide glyphosate), up to 3.7% of the consumed transgene was detected by
PCR as a 180-base-pair fragment in the digestion products of the
ileostomists. The full-length epsps gene (2.27 kilobase pairs) was detected
in many of the collected samples. However, the epsps gene fragments were not
detected in the feces of healthy volunteers with intact intestinal tracts,
indicating that they were completely degraded during passage through the
small and large intestines.

If the genes that are present in the food we eat remain intact in our 
intestinal tracts, can they be incorporated into our genomes, where they 
could disrupt the function of a gene or where they could be expressed, 
resulting in the production of a foreign protein? Several studies have 
shown that neither transgenic DNA nor the recombinant proteins encoded 
by the transgenes present in food are found in the tissues of humans and 
livestock that consume the food. PCR and Southern hybridization failed to 
detect recombinant DNA in the milk, eggs, skin, muscle, and other tissues 
of several livestock animals fed genetically engineered crops. 
Nonrecombinant chloroplast DNA, which is naturally present in multiple 
copies in plant cells, was detected in some tissues. This may have implica¬ 
tions for transgenes that are incorporated into the chloroplast genome. 


TABLE 23.4 Some foods containing allergenic proteins that may be reduced through
genetic engineering

Food plant   Allergenic protein                          Strategy to reduce allergenic protein

Rice         14-16-kilodalton allergen                   Antisense RA17 gene silencing
Soybean      Gly m Bd 30K (P34; cysteine protease)       P34 sense cosuppression
Apple        Mal d 1                                     Mal d 1 RNA interference
Tomato       Lyc e 1 (profilin)                          Lyc e 1 RNA interference
Tomato       Lyc e 3 (lipid transfer protein [LTP])      LTP RNA interference
Peanut       Ara h 2 (conglutin 7)                       Ara h 2 RNA interference

Even when present in tissues, integration of functional genes from food
DNA, transgenic or otherwise, into the genome of the consuming animal 
has never been found and is likely restricted by (1) the small fragment sizes 
of the DNA remaining in the gastrointestinal tract following food digestion, 
(2) the presence of cytosolic nucleases that further digest foreign DNA that 
is taken up by intestinal epithelial cells, (3) insufficient sequence homology 
for integration by homologous recombination, and (4) the requirement for 
appropriate transcription and translation signals for expression of an
integrated gene.

The possibility that antibiotic resistance genes used as selectable 
markers during the process of creating transgenic plants could be
transferred to microorganisms in the intestinal tract has also been investigated.
Because most of the DNA derived from food that reaches the intestines, 
and particularly the colon, where most gut microorganisms reside, has 
been digested into small fragments, the risk of an intestinal microbe taking 
up and expressing a functional antibiotic resistance gene is extremely low. 
For example, an antibiotic resistance gene in transgenic corn was
undetectable after 1 minute in sheep rumen fluid and in the intestines of chickens.
Although there is no evidence from long-term studies of a variety of
animals that antibiotic resistance genes consumed in transgenic food have
been transferred to intestinal bacteria, to alleviate public concerns,
alternative selection methods that avoid antibiotic resistance genes have been
developed. 

Controversy about the Labeling of Genetically Modified Foods 

The labeling of biotechnology-derived foods is a contentious issue. In the 
United States, the composition of the product, not the process by which the 
product is produced, determines whether specific information should be 
added to a label. For example, corn syrup derived from Bt corn is identical 
to that from conventional corn. Consequently, in the United States, this 
type of product does not require a special label. In other words, the FDA 
policy follows the dictum, "If it quacks like a duck, walks like a duck, and 
looks like a duck, then it must be a duck." Labeling is required if the 
nutrient content of a genetically modified food is substantially different 
from that of the traditional product; if the food is novel, i.e., it never has 
been produced before; if it is likely to contain a potential allergen; or if it 
has an increased level of a toxic agent. In Europe, in contrast, the process 
has precedence over the product. Accordingly, it is mandatory for all 
European Union countries to label any food as "genetically modified" if 
more than 0.9% is from a genetically modified organism. The genetically 
modified food must also be traceable from its source on the farm through 
all stages of processing, storage, and transportation. This is to facilitate 
removal of the product should it be found to have adverse effects. Australia 
and New Zealand require the same designation if "novel DNA and/or 
protein is present in the final product," but not if the food is highly refined 
or there are no genetically modified ingredients in the final product. In 
most cases, the stringency of the labeling is not solely dependent on health 
concerns; often, economic and philosophical issues are also important
considerations.
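
The contrast between the product-based and process-based approaches described above can be sketched as a pair of decision rules. This is an illustrative simplification and not regulatory text: the class, the field names, and the reading of the 0.9% threshold as a simple fraction of the food are assumptions made for the example.

```python
from dataclasses import dataclass

# 0.9%, the European Union labeling threshold cited in the text.
EU_GM_THRESHOLD = 0.009


@dataclass
class Food:
    """Hypothetical description of a food product for labeling purposes."""
    gm_fraction: float                      # fraction derived from a GM organism (0.0 to 1.0)
    nutrient_content_differs: bool = False  # substantially different from the traditional product
    is_novel: bool = False                  # never produced before
    may_contain_allergen: bool = False
    has_elevated_toxin: bool = False


def needs_label_us(food: Food) -> bool:
    """Product-based rule: only the properties of the food itself matter."""
    return (
        food.nutrient_content_differs
        or food.is_novel
        or food.may_contain_allergen
        or food.has_elevated_toxin
    )


def needs_label_eu(food: Food) -> bool:
    """Process-based rule: more than 0.9% GM-derived content triggers a label."""
    return food.gm_fraction > EU_GM_THRESHOLD


# Corn syrup from Bt corn: compositionally identical to conventional syrup,
# but entirely derived from a genetically modified crop.
corn_syrup = Food(gm_fraction=1.0)
print(needs_label_us(corn_syrup), needs_label_eu(corn_syrup))
```

Under these assumptions the same product is unlabeled under the first rule and labeled under the second, which is the source of the trade friction discussed later in the chapter.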

Historically the food industry in the United States has argued against 
labeling genetically modified foods because it will be perceived as a warning 
and will unnecessarily stigmatize the product. On the other hand, as a
marketing strategy, some food companies and retail outlets are specifically
labeling products that do not have any genetically modified ingredients. 
Since a label cannot be misleading or untruthful, the FDA has objected to 
descriptions such as "no genetically modified organisms" or "genetically 
modified free," because in the former case the product does not contain a 
viable organism and in the latter case there is no guarantee that some
contamination has not occurred. Terms such as "genetically engineered" and
"made through biotechnology" are acceptable to the FDA when there is no 
need to specify nutritional or health requirements. Consumer advocate 
groups in the United States have argued that the consumer has a right to 
know if a product comes from a genetically modified organism regardless 
of an equivalence of properties. Although legislation supporting this
principle has been drafted by members of the U.S. Senate and House of
Representatives, none of these proposals have had much success. 


Concerns about the Impact of Genetically Modified 
Organisms on the Environment 

There are currently more than 6.7 billion people on Earth, and the
population is projected to grow to 9 billion in about 30 years. This is an enormous
number of people to feed with an agricultural system that is already
burdened by crop reductions due to disease, pests, adverse weather (drought,
flooding, and hurricanes), competition with nonfood crops, and soil quality 
deterioration due to overuse of agricultural chemicals. Production of
transgenic crops can offer solutions to some of these environmental problems by
increasing the resistance of important crop plants to infectious disease,
insect predation, and climate change and reducing applications of
pesticides. However, some critics believe that our increasing dependence on
genetically engineered crops is adversely affecting the environment by 
decreasing biodiversity and harming unintended organisms. 

Impact on Biodiversity 

In general, agriculture has decreased biodiversity. Crop plants are selected 
for high yields of their edible parts, hardiness, and other traits at the cost of 
loss of richness, not only of cultivated species, but also of wild plants as 
cultivation has expanded over a greater area. Newer, higher-yielding
varieties have tended to replace traditional crops. The diversity of insects,
birds, and other animals has also decreased as more land is dedicated to 
agriculture and host and food plants are lost. Application of agricultural 
chemicals to eradicate weeds and pests is a contributing factor. There is 
some concern that the trend toward extensive cultivation of crops that have 
been genetically engineered to resist environmental factors that normally 
limit plant growth, i.e., herbicide tolerance and insect resistance, could 
further decrease biodiversity. This concern stems mainly from the
possibility that the transgenic plants could become weedy or invasive of natural
habitats or that the transgenes can be transferred from genetically
engineered crops to wild relatives or to non-genetically engineered crops and
could confer on them a selective advantage. Enhanced growth of the
unintentionally engineered plants could cause them to be invasive or to
outcompete plants without the transgene.

Several species of weeds have been found to be resistant to the
herbicide glyphosate. For the most part, this is due to overuse of the herbicide,
which selects for natural mutations that increase resistance in weeds, rather
than to the transfer of the transgene through cross-breeding
(hybridization), although the latter has been shown to occur at a low frequency.
Pollination of wild relatives by genetically engineered crops can occur 
when they are within range of pollen distribution by wind, insects, etc., and 
when pollen production by the genetically engineered crop and flower 
production by the wild relative occur at the same time. Transfer of genes 
among plants that mainly self-pollinate (pollination of a flower by its own 
pollen), such as rice and soybeans, occurs at a low frequency. For example, 
transfer of a herbicide resistance gene from genetically engineered rice to 
wild red rice was found to occur at a frequency of less than 1%. The
frequency of transfer is higher for plants that cross-pollinate (the pollen from
one plant fertilizes the ovule of another plant). The glyphosate resistance
gene from transgenic canola (Brassica napus), an important oilseed crop,
was transferred to a wild relative (Brassica rapa, or field mustard) at a
frequency ranging from 7 to 14% depending on the B. rapa population. This
was determined by collecting seeds from herbicide-sensitive B. rapa plants 
that had been interplanted with herbicide-tolerant transgenic B. napus and 
then assessing herbicide tolerance and production of the specific protein 
encoded by the transgene in the plants that developed from the B. rapa 
seeds. While cross-pollination occurred at a relatively high frequency 
between B. napus and B. rapa, hybridization between B. napus and three 
other wild relatives was extremely rare. 
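
The transfer frequencies quoted above come down to a simple ratio: the number of progeny that carry the transgene divided by the number of progeny screened. A minimal sketch of that arithmetic follows. The function name and the seedling counts are hypothetical, chosen only so that the result falls within the 7 to 14% range reported for B. rapa.

```python
def hybridization_frequency(tolerant_seedlings: int, seedlings_screened: int) -> float:
    """Fraction of screened progeny that inherited the herbicide tolerance trait."""
    if seedlings_screened <= 0:
        raise ValueError("at least one seedling must be screened")
    if not 0 <= tolerant_seedlings <= seedlings_screened:
        raise ValueError("tolerant seedlings cannot exceed seedlings screened")
    return tolerant_seedlings / seedlings_screened


# Hypothetical screen: seeds collected from herbicide-sensitive B. rapa plants
# are germinated, and the seedlings are scored for herbicide tolerance.
frequency = hybridization_frequency(tolerant_seedlings=70, seedlings_screened=1000)
print(f"{frequency:.1%}")
```

In the study described in the text, tolerance was also confirmed by detecting the protein encoded by the transgene, so a real screen would count only seedlings that pass both checks.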

The spread of herbicide resistance genes reduces the ability of farmers 
to control weeds, and the weedy plants have the potential to become
invasive, as they can proliferate in areas where the herbicides are used.
Herbicide-resistant plants are unlikely to have an advantage in
nonagricultural areas where herbicides are not applied. Moreover, hybridization does
not always confer an advantage on the hybrid progeny. Hybrids resulting
from cross-pollination by transgenic rice resistant to glufosinate (a
herbicide that disrupts glutamine biosynthesis) were less fit than their wild-type
parents because flowering occurred too late to produce seed, which
prevented proliferation of the hybrids. Hybrids that express a Bt toxin
transgene may have a significant advantage when insect pressure is high.

The potential for gene flow to nontransgenic cultivars and wild
relatives and its consequences are included in risk assessments required by
regulatory agencies before a genetically engineered crop is
commercialized, and strategies to manage the risks must be in place prior to
cultivation. The assessment considers the mechanism by which the plant
reproduces and pollen is disseminated, the nature of the transgene and any 
selective advantage it may impose, and the geographic context in which 
the crop will be grown, which includes the identification of sexually
compatible wild relatives. Management strategies can include avoiding
cultivation of a transgenic crop in areas where indigenous wild varieties are found
and controlling embryo and/or seed viability through genetic
engineering.

Impact of the Bt Toxin on Nontarget Insects 

In May 1999, a report entitled "Transgenic pollen harms monarch larvae" 
by Losey et al. was published in the scientific journal Nature. In this study, 
pollen from Bt corn was sprinkled on milkweed leaves, the sole source of 
food for the larvae of the monarch butterfly, and extremely high mortality 
was observed after monarch butterfly larvae fed on the treated leaves. The
authors concluded that extensive Bt crop acreage could have "profound 
implications for the conservation of monarch butterflies." The study, which 
was conducted in the laboratory rather than under natural field conditions, 
incited much criticism. Scientists pointed out that appropriate controls 
were not used, the amount of pollen on the leaves was not calculated, and 
there was no indication whether the time of pollen shedding coincided 
with a feeding period of monarch butterfly larvae. These technical concerns 
were largely ignored because the monarch butterfly has iconic and
aesthetic status in North America. Consequently, the study was widely
reported and the threat to the monarch butterfly formed the cornerstone of 
the campaign against biotechnology in general. However, after 2 years of 
detailed studies, the initial skepticism of most entomologists was
confirmed when it was proved that the risk of Bt toxin toxicity to the larvae of
the monarch butterfly was negligible. Antibiotechnology factions have 
been reluctant to accept these findings. Much of their literature continues 
to stress that Bt crops are a threat to the survival of the monarch butterfly. 

More recently, reports in the popular media have emerged suggesting 
that pollen containing the Bt toxin was responsible for severe reductions in 
the honeybee population in North America and Europe. Declines in the 
honeybee population, a phenomenon known as colony collapse disorder, 
were occurring at an alarming rate, and the consequences would be very 
serious if pollinators were not available to produce fruits. However, an 
analysis of 25 independent studies concluded that pollen containing the
Cry proteins that are present in transgenic crops currently under
cultivation and that are effective against coleopteran and lepidopteran predators
is not toxic to honeybees, which are hymenopteran insects. After a great deal of
investigation, the general consensus among researchers is that because the
Bt toxins are highly specific for target insects and are confined to plant
tissues, the impact on nontarget insects is minimal. Several studies have
shown that insect diversity and abundance have actually increased with 
the cultivation of Bt crops compared to application of chemical insecticide 
sprays that have a broader target range. 

Environmental Benefits of Genetically Modified Organisms 

Cultivation of plants that produce heterologous proteins that provide
protection from pests has led to a global reduction in the application of
agricultural chemicals, many of which have toxic effects on the environment.
Spraying of insecticides was reduced by 94.5 million kg, or 19.4%, from 
1996 to 2005 due to the cultivation of insect-resistant cotton alone.
Insect-resistant corn accounted for a further 4.6% reduction. For example, the
western corn rootworm is a devastating and pervasive insect pest of corn
and causes huge economic losses in the United States and elsewhere.
Because the predator is difficult to control, farmers routinely apply
insecticides to their cornfields before infestation by the worm is apparent. The
cultivation of genetically engineered varieties of corn that produce the
Cry3Bb1 toxin, which is effective against the western corn rootworm, has
obviated the spraying of insecticides against that insect.

Although some would argue that the widespread cultivation of
herbicide-tolerant crops has increased our dependence on herbicides, it has
actually resulted in a 25 to 30% reduction in herbicide applications to transgenic
crops compared to conventional crops. While the use of glyphosate, the
herbicide to which most crops are engineered to be tolerant, has increased
sharply over the last decade, the use of other herbicides has decreased. The 
herbicides that are applied are generally at lower strength and are less 
toxic. In addition, the herbicides can be applied to fields containing tolerant 
crops later in the season, after the crops have grown. This has dramatically 
reduced (and on many farms eliminated) soil tilling, the practice of turning 
the soil to remove weeds, and has thereby increased soil quality by reducing 
erosion and organic carbon loss and has decreased fuel consumption by 
farm machinery. The latter benefit is not trivial in its impact on the
environment, as the estimated reduction in emissions of carbon dioxide, a
greenhouse gas, as a consequence of growing herbicide-tolerant soybeans and
canola is the equivalent of removing 400,000 cars from the road for 1 year. 

Because different agricultural chemicals have different environmental 
toxicities, a better measure of their impact on the environment takes into 
consideration not only the amount of active ingredient applied, but also the 
toxicity of the chemical to farm workers, the consumer, and other
organisms, such as birds and insects; its persistence in the soil; and its potential
to leach into groundwater. These parameters are used to determine the 
relative environmental impact of each pesticide. Using these values, the 
global reduction in environmental impact resulting from changes in
herbicide and pesticide applications to genetically modified crops was
determined to be 15.3% over a 10-year period (1996 to 2005).
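
The idea of weighting each chemical by its relative environmental impact, rather than by mass alone, can be illustrated with a short calculation. This is a sketch under stated assumptions and not the method used to obtain the 15.3% figure: the impact scores and application rates below are invented, and each score stands in for a composite of the toxicity, persistence, and leaching parameters listed above.

```python
def total_impact(applications):
    """Sum of (impact score per kg) x (kg of active ingredient applied per hectare)."""
    return sum(score * kg for score, kg in applications)


def percent_reduction(before: float, after: float) -> float:
    """Percent change in impact when moving from one spray program to another."""
    if before <= 0:
        raise ValueError("baseline impact must be positive")
    return 100.0 * (before - after) / before


# Hypothetical spray programs, as (impact score per kg, kg applied per hectare).
conventional_crop = [(30.0, 2.0), (20.0, 1.0)]  # two relatively harmful herbicides
tolerant_crop = [(15.0, 3.0), (20.0, 0.5)]      # more of a milder herbicide, less of the other

before = total_impact(conventional_crop)
after = total_impact(tolerant_crop)
print(before, after, percent_reduction(before, after))
```

In this invented case the tolerant crop receives a greater mass of herbicide (3.5 kg versus 3.0 kg) yet has the lower weighted impact, which is why a mass-only comparison can be misleading.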

Many other examples have been presented throughout this book that 
illustrate the potential of genetic engineering to reduce damage to the
environment. Pigs engineered with the phytase gene from bacteria utilize
phosphorus in feed more efficiently and thereby reduce the phosphate content
in their feces by up to 75%. Phosphate is a major environmental pollutant 
from pork production. Plants and bacteria can be genetically engineered to 
more effectively remove toxic compounds, such as heavy metals, from
contaminated soils. Moreover, plants and animals can be engineered to more
efficiently utilize nutrients and to grow under nonoptimal conditions. 
These features could enable cultivation of crops on less land or on land that 
would otherwise not be usable, thereby meeting the food demands of an 
increasing global population with reduced impact on resources. 


Economic Issues 

Who Benefits from Molecular Biotechnology? 

Developers of biotechnology products that have applications in health care, 
agriculture, and industry have benefited financially from recombinant 
DNA technology. Global sales of all biotechnology goods and services by 
biotechnology companies exceed $70 billion annually, with firms in the 
United States accounting for more than half of the sales. Most of these
companies develop pharmaceuticals. For example, in the United States and
China, sales from health products account for 87 and 77% of total sales,
respectively. Recombinant pharmaceutical products that are highly
successful include the therapeutic proteins erythropoietin and human insulin,
which respectively have sales of more than $8.8 billion and $5.3 billion 
annually. 

Many smaller developers complain that the high cost of developing and 
commercializing a genetically engineered product prevents all but large 
multinational companies from profiting. For example, in the agricultural
biotechnology sector, where the cost of developing a genetically engineered
product can exceed $20 million, due in part to stringent regulatory
requirements, four companies own or co-own about 80% of the genetically
engineered products that have received commercial approval: DuPont (United
States), Monsanto (United States), Bayer CropScience (Germany), and
Syngenta (Switzerland). Moreover, high research and development costs 
have prohibited the commercialization of products that are not marketable 
on a large, profitable scale. This means that recombinant drugs that could
target rare diseases or diseases that are found predominantly in low-income
populations, as well as transgenic crops that are not widely planted or that
carry traits other than herbicide tolerance and insect resistance, do not
provide the revenues that are necessary to justify the high cost of product
development.
Improved varieties of crops that are important for resource-poor nations, 
such as rice, cassava, and millet, have not been developed as extensively. In 
addition, product development has been limited by antibiotechnology
markets, such as the European Union, which in turn influence development in
countries that rely on export to these markets. 

Farmers in both developed and developing nations have perhaps been 
the greatest beneficiaries of recombinant DNA crop technology, not only 
due to increased yields from genetically engineered crops, but also from cost 
savings due to fewer applications of agricultural chemicals and to reduced 
labor and machinery costs. Profits are generally higher even after the higher 
cost of the transgenic seed is taken into consideration. Statistics for 2005 
indicate that the global increase in farm income as a consequence of growing 
genetically modified crops was $5 billion. Most of the economic benefit 
(55%) was derived by farmers in developing countries. In particular, farmers 
in developing countries profited from cultivation of herbicide-tolerant
soybeans and insect-resistant cotton. For example, a multiyear comparison of
the productivity and profitability of small, resource-poor farms in South 
Africa growing nontransgenic cotton or bollworm-resistant Bt cotton found 
consistent yield increases of 156 to 185% for those growing Bt cotton. 
Although the Bt seed costs were roughly twice those of the conventional 
cotton, profits from the insect-resistant cotton were at least double due to 
higher yields, reduced pesticide costs, and reduced labor costs to spray the 
pesticide. For most insect-resistant crops, the greatest economic benefits are 
reaped in seasons when insect infestation is high. Because damage from 
insect predation is lower for the Bt crops, fungal infestations, which are 
often facilitated by insect damage to plant tissues, are also reduced. Cereals, 
oilseeds, and nuts contaminated with fungi are not suitable for sale due to 
increased risk of contamination with mycotoxins, such as aflatoxin, a
carcinogenic toxin produced by Aspergillus. The economic benefits due to
reduced mycotoxin contamination are significant at $30 million. 

How Do Views about Genetically Engineered Food Affect Trade? 

Exports of agricultural products from the United States to the European 
Union were valued at over $10 billion in 2008. However, differences in 
consumer acceptance and regulatory views have often made trade between 
the two regions difficult. In 1998, the European Union placed a moratorium 
on the importation of genetically modified corn, cotton, and soybean
products. This prompted a complaint in 2003 by the United States, Canada, and
Argentina to the World Trade Organization, which ruled in 2008 that the 
additional risk assessments and approvals required by the European Union 
for some of the genetically engineered products were not based on
scientific evidence and unfairly delayed trade. In contrast to the United States,
only a few genetically engineered crops have been approved for cultivation 
and importation in the European Union, and to avoid labeling foods in 
which more than 0.9% is derived from an approved genetically engineered 
ingredient(s), exporters to the European Union have been forced to
separate genetically modified crops from conventional crops, which is costly.
Despite the view in many other countries that genetically modified foods 
are safe, Europeans are strongly opposed to the products. This is due in 
part to previous food scares, such as mad cow disease (bovine spongiform 
encephalopathy) in the 1990s, which left Europeans wary of food safety 
authorities who claimed that controversial foods were safe for human
consumption. There is, however, concern among some farmers in the European
Union that if new varieties of genetically engineered crops are not approved 
for importation from major feed producers in North and South America, 
where they are grown extensively, then there will be serious shortages of 
feed for livestock. 

The resistance of the European Union to transgenic products has forced 
many developing countries to forgo the benefits of planting genetically 
engineered crops for fear that trade with the European Union will be
jeopardized. In an extreme example, thousands of tons of food aid received by
Zambia from the United States during the famine of 2001 were rejected.
This was because the donation was believed to contain genetically
modified corn. Subsequent to this, from 2002 to 2004, several other African
countries refused food aid containing genetically modified organisms, putting
15 million Africans at risk for starvation. Many suspect that the decision to 
reject the aid was made to preserve trade relationships with countries that 
disfavor genetically modified food products. If donated seeds are planted, 
even inadvertently, then African crops may not be exported to markets 
such as the European Union. On the other hand, the United States has been 
accused of using food aid to introduce genetically modified food to
developing countries.


SUMMARY 


When assessing the acceptability of products of molecular
biotechnology, people generally want to know what the
benefits are compared to similar products derived from a
conventional technology, who benefits from the biotechnology
products, and the risks associated with the products. In
particular, consumers are concerned about the safety of foods
from genetically engineered microbes, plants, and animals 
and the impact of farming genetically engineered crops and 
livestock on the environment. The attitudes of consumers and 
the rigorous regulatory requirements for commercialization 
have greatly influenced the development and availability of 
biotechnology products. 

Many countries, including the United States, have adopted 
the principle of "substantial equivalence" when evaluating 
the safety of genetically engineered foods. This means that a 
genetically modified plant or animal food product must be 
similar in composition to the corresponding conventional 
food. Levels of nutrients, anti-nutrients, and natural toxins
must not be different, and animal-feeding trials must not
show differences in the development, health, or performance
of the animals that would indicate reduced nutrition or 
increased toxicity or allergenicity of the genetically modified 
food. If the genetically engineered product is not substantially 
different from the conventional product, then labeling it as 
genetically engineered is not required. Labeling is required if 
the food contains a potential allergen or higher levels of a 
toxin, although it is unlikely that such products would be 
commercialized due to lack of consumer acceptance. The 
European Union, in response to strong consumer opposition 
to genetically modified foods, has adopted a precautionary 
approach and requires that all foods in which more than 0.9% 
is derived from a genetically engineered organism(s) be 
labeled. This has made trade difficult among countries with 
opposing views of biotechnology. 

Cultivation of genetically modified crops or farming of 
transgenic livestock necessarily requires the release of the 
organism into the environment. Some people are concerned 
that this may increase the risk for unintended harm to natural 
organisms in an environment or for transfer of the transgene
to non-genetically engineered organisms, which could render 
them weedy or invasive. The vast majority of the transgenic 
crops currently grown have been engineered to resist
predation by specific insects or to tolerate herbicides. They have
resulted in substantial reductions in applications of
agricultural chemicals and have reduced the agricultural footprint on
the environment. Because the Bt insecticidal toxins are highly
specific for their target insects and are confined to plant
tissues, the impact on nontarget insects is very low. While the
transgenes have been shown to be transferred to
non-genetically engineered crops and wild relatives through
cross-pollination, the herbicide-tolerant hybrids do not have an advantage
in areas where the herbicides are not applied and
insect-resistant hybrids have an advantage only when predation by
the specific insect is high. To prevent the spread of the
transgene, cultivation of genetically engineered crops should be
avoided in areas where sexually compatible nontransgenic 
crops and wild varieties are found. 

Biotechnology products and services have enjoyed huge 
commercial success. Not only have the developers of the
technology and products reaped the financial rewards, so have
some of the downstream users of the biotechnology products. 
In particular, farmers in both developed and developing 
nations who plant genetically modified crops have benefited 
from increased profits due to higher yields and fewer
pesticide applications. However, opponents point out that the cost
of biotechnology products is high, for both the developer and 
the users, and this limits the products that are available. 
REVIEW QUESTIONS


1. Discuss some of the reasons why foods derived from 
genetically engineered organisms might be safer for
consumption than those developed through selective breeding.
Discuss some of the reasons why genetically modified foods 
might be less safe. 

2. What is meant by the term "substantial equivalence," and 
how is it used to assess food derived from genetically
engineered organisms?

3. Why are animal-feeding trials important in the evaluation 
of the safety of genetically modified foods? 

4. What are some of the benefits of golden rice? Why has 
commercialization been delayed? 

5. How have some reports in the popular media contributed 
to the opposition to genetically engineered foods among
consumers?

6. What is StarLink corn, and why is it no longer marketed? 

7. When is a genetically engineered food required to be
labeled in the United States and in the European Union?
Why are the requirements different in the two regions?


8. Explain why the risk of incorporation of a transgene in 
genetically modified food into the genome of the consumer 
or intestinal bacterium is low. 

9. Discuss the potential for cultivation of genetically
engineered crops to decrease biodiversity.

10. Why is the risk of harm to nontarget insects lower for 
insect-resistant transgenic crops than for insecticide sprays? 

11. Describe some ways in which genetically modified 
organisms can benefit the environment. 

12. How have farmers benefited economically from growing 
insect-resistant transgenic crops? 

13. How have consumer attitudes toward genetic
engineering and stringent regulatory requirements limited the
development of biotechnology products? How do they 
impact trade among nations? 
Amino Acids of Proteins
and Their Designations 


Amino acid       Three-letter designation   Single-letter designation
Alanine          Ala                        A
Arginine         Arg                        R
Asparagine       Asn                        N
Aspartic acid    Asp                        D
Cysteine         Cys                        C
Glutamic acid    Glu                        E
Glutamine        Gln                        Q
Glycine          Gly                        G
Histidine        His                        H
Isoleucine       Ile                        I
Leucine          Leu                        L
Lysine           Lys                        K
Methionine       Met                        M
Phenylalanine    Phe                        F
Proline          Pro                        P
Serine           Ser                        S
Threonine        Thr                        T
Tryptophan       Trp                        W
Tyrosine         Tyr                        Y
Valine           Val                        V
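
The designations above map naturally onto a lookup table. As a minimal sketch, the following converts a list of three-letter designations into the single-letter string commonly used for protein sequences. The dictionary and function names are illustrative choices, and the short peptide is an arbitrary example.

```python
# Three-letter to single-letter designations for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_single_letter(residues):
    """Join three-letter designations into a single-letter sequence string.

    Raises KeyError if a designation is not one of the 20 listed above.
    """
    return "".join(THREE_TO_ONE[residue] for residue in residues)


print(to_single_letter(["Met", "Ala", "Gly", "Trp"]))  # prints MAGW
```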
Glossary


A An adenine residue in either DNA or RNA. 

Ab See Antibody. 

Absolute linkage See Complete linkage. 

ACC 1-Aminocyclopropane-1-carboxylate. In plants, ACC is
the immediate precursor of ethylene. 

ACC deaminase A microbial enzyme that can cleave ACC to 
ammonia and α-ketobutyrate.

ACC oxidase A plant enzyme that catalyzes the oxidation of 
ACC to ethylene. Formerly called ethylene-forming enzyme. 

ACC synthase A plant enzyme that catalyzes the synthesis of 
ACC from S-adenosylmethionine. Its activity is stimulated by 
indoleacetic acid. 

Acetylation Addition of an acetyl group (CH3CO-) to a
protein or other molecule. 

Actin A major protein component of skeletal muscle. A
contractile protein present within eukaryotic cells.

Activation Enhancement of the rate of transcription. 

Activator (1) A substance or physical agent that stimulates 
transcription of a specific gene or operon. (2) A protein that 
binds to an operator and enhances the rate of transcription. 
Also called activator protein. 

Activator site A DNA sequence to which an activator protein 
binds. Also called activating site. 

Acyl carrier protein A low-molecular-weight protein that 
forms part of a larger complex for either fatty acid or 
polyketide biosynthesis. 

Acylation Addition of an acyl group (RCO-) to a molecule.

Adaptor (1) A synthetic double-stranded oligonucleotide 
that is blunt ended at one end and at the other has a
nucleotide extension that can base pair with a cohesive end created
by cleavage of a DNA molecule with a specific type II
restriction endonuclease. After blunt-end ligation of the adaptor to
the ends of a target DNA molecule, the construct can be
cloned into a vector by using the cohesive ends of the adaptor.
(2) A synthetic single-stranded oligonucleotide that, after
self-hybridization, produces a molecule with cohesive ends
and an internal restriction endonuclease site. When the
adaptor is inserted into a cloning vector by means of the
cohesive ends, the internal sequence provides a new
restriction endonuclease site.

Adenine One of the organic bases found in either DNA or 
RNA. 

Adjuvant A substance added to an immunogen (antigen) to 
increase the immunological response. 

Aerobe A microorganism that requires oxygen for growth 
(respiration). 

Affinity purification Selective isolation of a tagged molecule 
due to the specific binding of the tag (e.g., biotin) to another 
molecule (e.g., avidin or streptavidin). 

Affinity tag A short sequence of amino acids that is
engineered as part of a recombinant protein and binds to a
specific element, compound, or macromolecule, which facilitates
the identification or purification of the recombinant protein.
Also called peptide tag, protein tag.

Ag See Antigen. 

Airlift fermenter A cylindrical fermentation vessel in which
the cells are mixed by air that is introduced at the base of the
vessel and rises through the column of culture medium. The
cell suspension circulates around the column as a
consequence of the gradient of air bubbles in different parts of the
reactor.

Alginate A polysaccharide polymer produced by different
seaweeds and bacteria that is composed of β-D-mannuronate
and α-L-guluronate.

Algorithm A precise procedure for solving a problem that is
usually implemented by a computer program. Bioinformatics
algorithms are developed to process, store, analyze, and
visualize biological data.

Alignment Positioning of nucleotides or amino acids in two 
or more DNA, RNA, or protein sequences to line up regions 
of the sequences that are identical or similar.

Alkaloid One of a group of nitrogenous organic compounds
derived from plants and having pharmacological properties. 

Allele An alternative form of a gene. 

Allelic frequency The ratio of the occurrence of one
particular allele at a locus to the occurrence of all the alleles of the
locus in a large number of individuals of a population.
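
As a worked illustration of this ratio, the sketch below counts alleles at a single locus across a sample of diploid individuals. It is not from the original text; the function name `allele_frequencies` and the representation of a genotype as a pair of allele labels are assumptions made for the example.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one locus.

    `genotypes` is a list of diploid genotypes, each given as a pair of
    allele labels, e.g. ("A", "a"). This representation is an assumption.
    """
    # Count every allele copy carried by every individual.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    # Frequency = occurrences of one allele / occurrences of all alleles.
    return {allele: n / total for allele, n in counts.items()}


sample = [("A", "A"), ("A", "a"), ("a", "a"), ("A", "a")]
print(allele_frequencies(sample))  # {'A': 0.5, 'a': 0.5}
```

With four individuals there are eight allele copies, four of each type, so both frequencies are 0.5. A real estimate would need a much larger sample, as the definition notes.
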

Allelochemical A biologically synthesized chemical
produced by one organism (usually a plant) and toxic or
inhibitory to another. For example, many plant metabolites are
toxic to insects or to some fungi.

Allergen A substance that stimulates an allergic reaction in
sensitive individuals. Antibodies (immunoglobulin E) are
produced inappropriately by the immune system in response to
the allergen, leading to a hypersensitivity response.

Allogeneic Having different antigens (an attribute of cell 
types). Also called allogenic. 

Allolactose An isomer of D-lactose that is the actual inducer
of the Escherichia coli lac operon. The enzyme β-galactosidase
converts lactose to allolactose.

Allosteric control See Allosteric regulation. 

Allosteric regulation A catalysis-regulating process in which 
the binding of a small effector molecule to one site on an 
enzyme affects the catalytic activity at another site on the 
enzyme. 

Alternative splicing Cell-specific removal of an exon(s)
during processing of a primary transcript that leads to a
functional mRNA.

Amber mutation An alteration of the DNA resulting in the 
change of a codon specifying an amino acid to the nucleotide 
triplet TAG, which encodes UAG, a nonsense or stop codon. 

Amino acid A building block of a protein. 

Aminoacyl site The portion of a ribosome where the
anticodon-codon interaction takes place during translation. Also
called an A site.

Aminoacyl-tRNA A charged tRNA; a tRNA with its specific 
amino acid attached to its 3' end. 

1-Aminocyclopropane-1-carboxylate See ACC.

Amplicon (1) A herpes simplex virus type 1 plasmid vector. 
Also called amplicon plasmid. (2) A specific DNA fragment 
produced by a polymerase chain reaction. 

Amylolytic Capable of breaking down starch into sugars 
(e.g., agents). 

Anaerobe A microorganism that grows (respires) in the 
absence of oxygen. 

Analytical protein microarray A high-density array of
antibodies that captures proteins or other compounds or a
protein array that captures antibodies.

Annealing The process of heating (denaturing step) and
slowly cooling (renaturing step) double-stranded DNA to
allow the formation of hybrid DNA or DNA-RNA
molecules.


Annotated database Computer-stored data that are
supplemented with additional information, such as detailed
descriptions, comments, and references.

Antibiosis The prevention of growth or development of an 
organism by a substance or another organism. 

Antibiotic A biological substance that is produced by one 
organism and that can inhibit the growth of, or kill, another 
organism. 

Antibody A protein (immunoglobulin) that is synthesized 
by a B lymphocyte and that recognizes a specific site on an 
antigen. The basic immunoglobulin molecule consists of two 
identical heavy and two identical light chains. 

Anticodon A set of three contiguous nucleotides in a tRNA 
molecule that are complementary to a set of three contiguous 
nucleotides (codon) in an mRNA. 

Antifreeze protein A type of protein that binds to ice crystals 
and depresses the freezing temperature of the crystal below 
its melting temperature. Antifreeze proteins have been found 
in fish, insects, plants, fungi, and bacteria and protect cells 
from being damaged by ice crystals. 

Antigen A compound that induces the production of
antibodies.

Antigenic determinant See Epitope. 

Anti-idiotype antibody An antibody that has the properties 
of an antigen. 

Antiparallel orientation The arrangement of the two strands 
of a duplex DNA molecule, which are oriented in opposite 
directions, so that the 5' phosphate end of one strand is 
aligned with the 3' hydroxyl end of the complementary 
strand. 

Antisense DNA (1) The sequence of chromosomal DNA that 
is transcribed. (2) A DNA sequence that is complementary to 
all or part of a functional RNA (mRNA). 

Antisense RNA An RNA sequence that is complementary to 
all or part of a functional RNA. 

Antisense therapy The in vivo treatment of a genetic disease 
by blocking translation of a protein with a DNA or an RNA 
sequence that is complementary to a specific mRNA. 

Antiserum The fluid portion of the blood that contains the 
antibodies of an immunized organism. 

Aphid A plant-sucking insect of the family Aphididae. 

Apoplast The set of intercellular spaces that are inside a 
plant but outside the plant cells. 

Apoptosis A controlled process leading to the death of the
cell that occurs normally during the development of a
multicellular organism or in response to cell damage or infection.
Also known as programmed cell death.

Aptamer A synthetic nucleic acid, typically 15 to 40
nucleotides long, that has highly organized secondary and tertiary
structures and binds with high affinity to a protein that
normally does not bind to a nucleic acid.

Aquaculture Farming of fish or other marine or freshwater
organisms under controlled conditions to produce food. 

Arabidopsis thaliana A plant with a very small genome that 
is used as a model organism for the study of plant growth 
and development. 

ARS See Autonomous replicating sequence. 

Arthropod An invertebrate animal, such as an insect, spider, 
or crustacean, that has jointed limbs and an exoskeleton. 

Articular cartilage Tough, rubbery, translucent, elastic tissue 
that forms the surfaces of bones within joints. 

Atelocollagen Collagen from calf dermis digested under 
acidic conditions with the proteolytic enzyme pepsin to form 
positively charged subunits of approximately 300 kilodaltons 
each. 

Atherogenic Causing the formation of lipid-containing 
plaques on the innermost layer of the wall of an artery. 

Attenuated vaccine A virulent organism that has been
modified to produce a less virulent form but nevertheless retains
the ability to elicit antibodies against the virulent form.

Authentic protein A recombinant protein that has all the 
properties, including any posttranslational modifications, of 
its naturally occurring counterpart. 

Autoantibody An antibody against one of one's own
proteins.

Autologous cells Cells that are taken from an individual, 
cultured (or stored), and possibly genetically manipulated 
before being infused back into the original donor. 

Autonomous replicating sequence Any cloned DNA
sequence that initiates and supports extrachromosomal
replication of a DNA molecule in a host cell; often used in yeast
cells. Also called autonomously replicating sequence,
autonomous(ly) replicating segment.

Autoradiography A technique that captures the image 
formed in a photographic emulsion as a result of the emission 
of either light or radioactivity from a labeled component that 
is placed next to unexposed film. 

Autosomal Encoded on chromosomes other than the sex 
chromosomes. 

Auxin See IAA. 

Auxotroph A mutant that is unable to synthesize an essential 
metabolite, and therefore, the metabolite must be provided as 
a nutritional supplement for growth. 

Avidin A glycoprotein component of egg white that binds 
strongly to biotin. 

B cells Lymphocytes that produce antibodies and that are 
derived from bone marrow cells. 

BAC See Bacterial artificial chromosome. 


Bacmid A shuttle vector based on the Autographa californica
multiple nucleopolyhedrosis virus genome that can be
propagated in both Escherichia coli and insect cells.

Bacterial artificial chromosome A vector system based on 
the Escherichia coli F factor plasmid that is used for cloning 
large (100- to 300-kb) DNA inserts. Abbreviated BAC. 

Bacteriocin A compound produced by one bacterium that 
can kill cells of another bacterial species. 

Bacteriophage A virus that infects bacteria. Also called 
phage. 

Bacteroid A modified bacterial cell formed following
infection of a legume root hair and subsequent formation of a root
nodule by a rhizobial strain when the bacteria inside the
nodule shed their cell walls, thereby facilitating exchanges of
nutrients between the bacteria and the plant.

Baculovirus A virus that infects insects. 

Bank See Gene bank. 

Base pair A term representing complementary nucleotides; in 
DNA, adenine (A) is hydrogen bonded with the base thymine 
(T), and guanine (G) is hydrogen bonded with cytosine (C). A 
thousand base pairs is often called a kilobase pair (kb). 

Base pair substitution Permanent replacement in
chromosomal DNA of a nucleotide pair with another nucleotide
pair.

Batch culture See Batch fermentation. 

Batch fermentation A process in which cells or
microorganisms are grown for a limited time. At the beginning of the
fermentation, an inoculum is introduced into fresh medium,
and no medium is added or removed for the duration of the
process.

β-1,3-Glucanase A plant enzyme, produced in response to
infection by fungal pathogens, that hydrolyzes some
components of fungal cell walls. Some bacteria can produce
β-1,3-glucanase.

β-Ketoreductase An enzyme involved in the synthesis of
polyketide antibiotics.

Betaine A low-molecular-weight compound that acts as a 
methyl group donor for methionine biosynthesis. 

Bifunctional vector See Shuttle vector. 

Binary fission Asexual cell division that produces equal-size 
daughter cells. 

Binary vector system A two-plasmid system in Agrobacterium
spp. for transferring a T-DNA region that carries cloned
genes into plant cells. The virulence genes are on one plasmid,
and the engineered T-DNA region is on the other plasmid.

Bioaccumulation Concentration of a chemical agent (e.g., 
DDT) in increasing quantities in the organisms of a food 
chain. 

Biocontrol Any process using living organisms to restrain 
the growth and development of pathogenic organisms.

Biodegradation The breakdown of a compound into its
chemical constituents by living organisms. 

Bioinformatics Research into and development and
application of computational tools to acquire, store, organize,
analyze, and visualize data for biological, medical, behavioral,
and health sciences.

Biolistics Delivery of DNA to plant and animal cells and 
organelles by means of DNA-coated pellets that are fired 
under pressure at high speed. Also called microprojectile 
bombardment. 

Biological aging See Senescence. 

Bioluminescence The production of light by biological 
organisms, such as insects and bacteria; usually catalyzed by 
the enzyme luciferase. 

Biomarker A biological feature that is used to measure either 
the progress of a disease or the effect of a treatment. 

Biomass (1) The cell mass produced by a population of 
living organisms. (2) The organic mass that can be used either 
as a source of energy or for its chemical components. 

Biomass concentration The amount of biological material in 
a specific volume. 

Biopolymer Any large polymeric molecule (protein, nucleic 
acid, polysaccharide, or lipid) produced by a living 
organism. 

Bioreactor A vessel in which cells, cell extracts, or enzymes
carry out a biological reaction. Often refers to a growth
chamber (fermenter or fermentation vessel) for cells or
microorganisms.

Bioremediation A process that uses living organisms to 
remove contaminants, pollutants, or unwanted substances 
from soil or water. 

Biosensor A biological molecule or organism that is able to 
detect a particular molecule in the environment. 

Biotechnology The application of scientific and engineering 
principles to the processing of material by biological agents to 
provide goods and services. 

Biotin A B vitamin that is often used in biological research as
a molecular tag because it has very high affinity for the
proteins streptavidin and avidin.

Biotin labeling (1) The attachment of biotin to another
molecule. (2) The incorporation of a biotin-containing nucleotide
into a DNA molecule.

Biotransformation Conversion of a substance into a product 
by an organism or an enzyme. 

BLAST Basic local alignment search tool. A computer
program for determining a match between a query sequence and
a sequence(s) in a database. BLASTn compares a DNA query
sequence to a DNA database, and BLASTp compares an
amino acid query sequence to a protein database.

Blastocyst A structure formed early in mammalian
embryonic development that consists of a sphere of cells, which will
later form the placenta, surrounding a fluid-filled cavity and
a cluster of cells that will become the embryo.

Blot See Northern blotting; Southern blotting. 

Blotting Transfer of a macromolecule by capillary action 
from a gel to a membrane. 

Blunt end The end of a DNA duplex molecule in which
neither strand extends beyond the other. Also called flush end.

Blunt-end cut To cleave phosphodiester bonds in the
backbone of duplex DNA between the corresponding nucleotide
pairs on opposite strands. This cleavage process produces no
nucleotide extensions on either strand. Also called flush-end
cut.

Blunt-end ligation Joining (ligation) of the nucleotides that 
are at the ends of two DNA duplex molecules, neither of 
which has an extension. 

Boll weevil An insect pest of cotton plants. 

Bovine spongiform encephalopathy An infectious disease 
of the central nervous system of cattle caused by prions that 
is characterized by lesions in the brain tissues. Also known as 
mad cow disease or BSE. 

Box A short DNA sequence that plays a role in regulating, 
facilitating, enhancing, or silencing transcription. 

bp See Base pair. 

Brewer's yeast Strains of yeast, often Saccharomyces cerevisiae, 
that are used in the production of beer. 

Broad-host-range plasmid A plasmid that can replicate in a 
number of different bacterial species. 

Brush border membrane The microvillus-covered surface of 
pseudostratified and simple columnar epithelium cells. 

Bubble column fermenter A fermentation vessel, or
bioreactor, in which the cells or microorganisms are kept
suspended in a tall cylinder by rising air that is introduced at the
base of the column.

Bystander effect The death of an unmodified cell caused by 
a cytotoxic metabolite that is produced by a genetically 
modified cell and acquired by cell-to-cell contact. 


C A cytosine residue in either DNA or RNA. 

C terminus The last amino acid of a protein. Sometimes 
denotes the final amino acids of a protein. Also called 
carboxy(l) terminus, carboxy(l)-terminal end. 

Calibrator lane The lane of a gel that contains
electrophoretically separated size markers.

Callus A mass of undifferentiated plant tissue, from tissue 
cuttings or from individual plant cells, grown in culture on 
defined medium. 

cAMP See Cyclic AMP.

Cancer Uncontrolled growth of the cells of a tissue or an 
organ in a multicellular organism.

Candidate gene A coding sequence that has some
characteristics that make it likely that it could be responsible for a
specific genetic disorder.

Candidate gene cloning A strategy for isolating a disease 
gene that is based on an informed guess about the possible 
gene product. 

Canola A plant whose seed is used to produce high-quality 
cooking oil. Formerly called rapeseed. Canola is Canada's 
most economically important crop. 

Cap See G cap. 

Caprolactam An organic compound (C6H11NO) that is a
lactam of 6-aminohexanoic acid. The primary industrial use
of caprolactam is as a monomer in the production of nylon.

Capsid A structure that is composed of the coat protein(s) of 
a virus and is external to the viral nucleic acids. The capsid 
often determines the shape of the virus. 

Carrier In genetics, an individual who has one mutant allele 
and one normal allele of a gene and whose phenotype is 
normal. 

Cassette A combination of DNA elements that performs a 
specific function and that is maintained as a clonable unit. 

Catalase An enzyme that catalyzes the decomposition of 
hydrogen peroxide into water and oxygen. 

CD molecules Designation for surface molecules on various
cells of the immune system, e.g., CD4 is present on the
surfaces of helper T cells.

cDNA Complementary DNA. A double-stranded DNA
complement of an mRNA sequence; synthesized in vitro by
reverse transcriptase and DNA polymerase.

cDNA clone A double-stranded DNA molecule that is
carried in a vector and that was synthesized in vitro from an
mRNA sequence by using reverse transcriptase and DNA
polymerase.

cDNA library A collection of cDNA clones that were
generated in vitro from the mRNA sequences of a single tissue or
cell population.

CDR See Complementarity-determining region. 

Cecropin A A 35-amino-acid peptide with antimicrobial 
activity from the giant silk moth, Hyalophora cecropia. 

Cell line A cell lineage that can be maintained in culture.

Cell-free protein synthesis See In vitro translation.

Cell-mediated immune response The activation of T cells of 
the immune system in response to the presence of a foreign 
antigen. 

Cellulose A high-molecular-weight polysaccharide of
unbranched chains of (1,4)-linked β-D-glucose units that, as
part of lignocellulose, contributes to the structural
framework of plant cell walls.

Cellulosome A multiprotein aggregate that is present in
some cellulolytic microorganisms and that contains multiple
copies of all the enzymes required to completely break down
cellulose. This complex is often found on the outer surfaces of
cellulolytic microorganisms.

Centromere The part of the chromosome that attaches to the 
spindle during cell division. 

Cephem-type antibiotic An antibiotic that shares the basic 
chemical structure of cephalosporin. 

Chagas disease A parasitic disease caused by the protozoan 
Trypanosoma cruzi. 

Chaperone A protein complex that aids in the correct folding 
of nascent or misfolded proteins. 

Charged tRNA A transfer RNA molecule that is coupled to 
its specific amino acid. Also called aminoacylated tRNA, 
aminoacyl tRNA. 

Chemiluminescence The emission of light from a chemical 
reaction. 

Chimera Usually, a plant or animal that has populations of
cells with different genotypes. Sometimes it refers to a
recombinant DNA molecule that contains sequences from different
organisms.

Chimeric protein See Fusion protein. 

Chitinase An enzyme that hydrolyzes the chitin components 
in fungal cell walls and insect exoskeletons. Produced by 
plants in response to infection by fungal pathogens. Some 
bacteria can produce chitinase. 

Cholesterol A sterol that occurs widely in animal tissues. A 
precursor of various steroids. Often found in membranes. 

Chromatin DNA and associated proteins that are compacted 
into chromosomes. 

Chromatin immunoprecipitation A technique that uses
specific antibodies to identify regions of a genome that are
bound by a particular DNA-binding protein of interest in
living cells.

Chromogenic substrate A compound or substance that
contains a color-forming group.

Chromosomal integration site A chromosomal location 
where foreign DNA can be integrated, often without impairing 
any essential function in the host organism. 

Chromosome A physically distinct unit of the genome. 

Cistron A sequence of DNA that encodes a polypeptide 
chain. 

Claims A section of a patent that states, in detail, the uses 
and possible applications of the invention described in the 
patent. 

Cleave To break phosphodiester bonds of duplex DNA,
usually with a type II restriction endonuclease. Also called cut,
digest.

Clone (1) A population of cells or organisms that are
genetically identical as a result of asexual reproduction, breeding of
purebred (isogenic) organisms, or forming genetically
identical organisms by nuclear transplantation. (2) A population
of cells that all carry a cloning vehicle with the same insert
DNA molecule. (3) To insert a DNA segment into a vector or
host chromosome.

Clone bank See Gene bank. 

Cloning Incorporating a DNA molecule into a chromosomal 
site or a cloning vector. 

Cloning site A location on a cloning vector into which DNA 
can be inserted. 

Cloning vector A DNA molecule that can carry inserted 
DNA and can be perpetuated in a host cell. Also called 
cloning vehicle, vector, vehicle. 

Cloning vehicle See Cloning vector. 

Coding triplet A set of three contiguous nucleotides of the
nontranscribed DNA strand of the coding region of a
structural gene that is complementary to a transcribed triplet.

Codon A set of 3 nucleotides in mRNA that specifies a tRNA 
carrying a specific amino acid that is incorporated into a 
polypeptide chain during protein synthesis. 

Codon optimization An experimental strategy in which 
codons within a cloned gene that are not the ones generally 
used by the host cell translation system are changed to the 
preferred codons without changing the amino acids of the 
synthesized protein. 

Codon usage The mean frequency of occurrence of each 
codon determined from a large sample of structural genes of 
an organism. 
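
To make the idea concrete, the sketch below tallies codons across a set of coding sequences and reports each codon as a fraction of all codons counted. It is an illustration only and not from the original text: the function name `codon_usage` is hypothetical, the sequences are assumed to be in frame from their first nucleotide, and expressing usage as a fraction of all codons (rather than per amino acid) is an assumption.

```python
from collections import Counter


def codon_usage(coding_sequences):
    """Return {codon: fraction of all codons} over the given sequences.

    Assumes each sequence starts in frame; trailing nucleotides that do
    not complete a codon are ignored.
    """
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # drop an incomplete final codon
        for i in range(0, usable, 3):
            counts[seq[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


genes = ["ATGGCTGCTTAA", "ATGGCCTAA"]  # two toy coding sequences
usage = codon_usage(genes)
print(round(usage["GCT"], 3))  # 0.286
```

The two toy sequences contain seven codons in total, two of which are GCT, giving 2/7. A meaningful table requires a large sample of genes, as the definition states.
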

Cofactor A low-molecular-weight compound that is a 
required component in an enzymatic reaction. 

Cofermentation The simultaneous growth of two
microorganisms in one bioreactor.

Cohesive ends Complementary single-strand extensions on 
the ends of duplex DNA molecules. Also called sticky ends. 
See also cos ends. 

Cointegrate vector system A two-plasmid system for
transferring cloned genes to plant cells. The cloning vector has a
T-DNA segment that contains cloned genes. After
introduction into Agrobacterium tumefaciens, the cloning vector DNA
undergoes homologous recombination with a resident
disarmed Ti plasmid to form a single plasmid carrying the
genetic information for transferring the genetically
engineered T-DNA region to plant cells.

Coleoptile A protective organ that covers the youngest 
leaves of a plant. 

Collagen An insoluble fibrous protein commonly found in 
connective tissue. Collagen accounts for over 30% of the total 
protein in mammals. 

Colon The large intestine. 

Colony hybridization A technique that uses a nucleic acid 
probe to identify a bacterial colony with a vector carrying a 
specific cloned gene(s). 

Combinatorial library During the ligation reaction with
cDNAs of light and heavy antibody chains into a
bacteriophage λ or M13 vector, many novel combinations consisting
of one heavy and one light chain coding region are formed.
The library comprises these combinations, each in a separate
vector.

Competence The ability of bacterial cells to take up (usually 
plasmid) DNA molecules. 

Complement A group of serum proteins that is activated by 
an antibody-antigen complex and has enzymatic and other 
biological activities, such as degrading antibody-antigen 
complexes, lysing cells, modulating antibody production, 
stimulating immune cells to migrate to a site of complement 
activity, and inducing the release of histamine. 

Complement cascade The series of sequential activations
and enzymatic reactions by serum proteins that is launched
in response to the formation of an antibody-antigen
complex.

Complementarity (1) A condition in which one of a pair of
nucleotide bases forms hydrogen bonds with the other.
Adenine (A) pairs with thymine (T) (or with uracil [U] in
RNA), and guanine (G) pairs with cytosine (C). (2) A
condition in which one of a pair of segments or strands of nucleic
acid hybridizes (joins by hydrogen bonding) with the
other.

Complementarity-determining region A part of the
variable (V) regions of light and heavy antibody chains that
makes contact with the antigen. The amino acid sequences of
complementarity-determining regions are highly variable
from one antibody of the same class to another. Abbreviated
CDR.

Complementary base pairs Pairs of nucleotide bases that
form hydrogen bonds with each other. In double-stranded
DNA, adenine forms hydrogen bonds with thymine, and
cytosine forms hydrogen bonds with guanine. In
double-stranded regions of RNA molecules, and in both RNA-RNA
and DNA-RNA strand interactions, adenine forms hydrogen
bonds with uracil, and cytosine forms hydrogen bonds with
guanine.
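
These pairing rules are enough to compute the strand that would pair with a given sequence. The sketch below is illustrative and not from the original text; the function names are hypothetical, and the result is reversed so that both strands read 5' to 3', reflecting the antiparallel orientation of paired strands.

```python
# Pairing rules from the definition above: A-T and G-C in DNA,
# A-U and G-C in RNA.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_dna(seq):
    """Return the complementary DNA strand, written 5' to 3'."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


def reverse_complement_rna(seq):
    """Return the complementary RNA strand, written 5' to 3'."""
    return seq.upper().translate(RNA_COMPLEMENT)[::-1]


print(reverse_complement_dna("ATGC"))  # GCAT
print(reverse_complement_rna("AUGC"))  # GCAU
```

Characters outside the four expected bases pass through `translate` unchanged, so input validation would be needed before using this on real data.
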

Complementary DNA See cDNA. 

Complementary homopolymeric tailing The process of 
adding complementary nucleotide extensions to different 
DNA molecules, e.g., to the 3' hydroxyl ends of one DNA 
molecule in dG (deoxyguanosine) and to the 3' hydroxyl ends 
of another DNA molecule in dC (deoxycytidine), to facilitate 
the joining, after they are mixed, of the two DNA molecules 
by base pairing between the complementary extensions. Also 
called dG-dC tailing, dA-dT tailing. 

Complementation See Genetic complementation. 

Complete linkage Two or more adjacent gene loci on the 
same chromosome that are always inherited together. Also 
called absolute linkage. 

Complete penetrance A situation in which all the
individuals with a mutant allele(s) at a gene locus show the same
abnormal (mutant) phenotype.


Computational biology Development and application of 
data analysis, modeling, and simulation techniques to study 
biological, behavioral, and social systems. 

Concatemer A tandem array of repeating unit-length DNA 
elements. 

Conformation The shape of a molecule or any other object. 

Conifer An evergreen or softwood tree, such as pine, fir, or 
spruce. 

Conjugation The unidirectional transfer of DNA from one 
bacterium to another, involving cell-to-cell contact. 

Conjugative functions Plasmid-based genes and their
products that facilitate the transfer of a plasmid from one
bacterium to another.

Constant domains Regions of antibody chains that have the
same amino acid sequence in different members of a
particular class of antibody molecules.

Constitutive synthesis Continual production of RNA or 
protein by an organism. 

Contig A set of overlapping contiguous clones that cover a 
chromosome region or a whole chromosome. 

Continuous fermentation A process in which cells or
microorganisms are maintained in culture in the exponential
growth phase by the continuous addition of fresh medium
that is exactly balanced by the removal of cell suspension
from the bioreactor.

Copy number The average number of a specific type of 
plasmid molecules in a cell. 

Corepressor A low-molecular-weight compound that
combines with an inactive repressor protein to form a complex
that binds to an operator region and prevents transcription.

cos ends The 12-base, single-strand, complementary
extensions of bacteriophage λ DNA. Also called cos sites.

Cosegregation Two genetic conditions appearing to be
inherited together.

Cosmid A vector that uses the cos end sequences of
bacteriophage λ and in vitro bacteriophage packaging to form, after
injection of the vector into a host cell, a plasmid that can carry
as much as 45 kb of insert DNA.

Cosuppression The transformation of a plant with a gene, in 
the sense orientation, that the plant already possesses. This 
results in the downregulation of both the endogenous and 
introduced genes. Also called sense suppression. 

Cotransfection The introduction of two different DNA
molecules into a eukaryotic cell. In baculovirus expression
systems, the procedure by which the baculovirus and the
transfer vector are simultaneously introduced into insect cells
in culture.

Coupling The phase state in which either two dominant or 
two recessive versions of two different genes occur on the 
same chromosome. Also called cis configuration. See also 
Repulsion. 


CpG islands Clusters of GC-rich regions that precede many 
transcribed vertebrate genes. 

Cross In genetic studies, the mating of two individuals. Also 
called mating. 

Crossbreeding The mating of two different species or
varieties to form a hybrid. Also known as hybridization.

Crossover (1) The site of recombination. A single crossover 
represents one reciprocal breakage-and-reunion event. A 
double crossover requires two simultaneous reciprocal 
breakage-and-reunion events. (2) The reciprocal exchange of 
DNA between two chromosomes or DNA molecules by a 
breakage-and-reunion process. Also called recombination, 
recombination event. 

Crown gall A bulbous growth that occurs at the bases of 
certain plants and that is due to infection of the plant by a 
member of the bacterial genus Agrobacterium. Also called 
crown gall tumor. 

Crown gall tumor See Crown gall. 

Crucifer A plant, such as cabbage, that is a member of the 
family Cruciferae. 

Cryptic site A functional macromolecular sequence in an 
unlikely location. Also used, in some instances, to denote a 
macromolecular sequence whose function is unknown. 

Cultivar A variety of plant that is (1) below the level of a
subspecies taxonomically and (2) found only under
cultivation.

Culture A population of cells or microorganisms that are 
grown under controlled conditions. 

Culture medium A solid or liquid mixture that is used to 
grow microorganisms, organisms, or cells. 

Cut See Cleave. 

Cyclic AMP Adenosine 3',5'-cyclic phosphoric acid, an 
important regulatory molecule. Also called cAMP. 

Cyclic-array sequencing A large-scale method for
simultaneously determining the nucleotide sequences of millions of
DNA fragments immobilized in a dense array.

Cytokine Any of several regulatory proteins, such as the 
interleukins and lymphokines, that are released by cells of the 
immune system and that act as intercellular mediators in the 
generation of an immune response. 

Cytokinin A plant hormone that stimulates cell division. 

Cytoplasm The contents of a cell enclosed by the cytoplasmic 
membrane (outside of the nucleus of a eukaryotic cell). 

Cytosine One of the organic bases found in either DNA or 
RNA. 

Cytosol The semifluid soluble portion of the cytoplasm of 
cells. 


dA-dT tailing See Complementary homopolymeric tailing. 

Damping off A disease of plant roots, generally caused by a 
pathogenic fungus. 

Database A file system of formatted information that can be 
readily accessed, retrieved, and updated. 

Dehalogenation The removal of halogen atoms (chlorine, 
iodine, bromine, or fluorine) from molecules, usually during 
biodegradation. 

Deletion Loss of an internal portion of chromosomal DNA. 

Denaturation (1) Separation of duplex nucleic acid mole¬ 
cules into single strands. (2) Disruption of the conformation 
of a macromolecule without breaking covalent bonds. 

Denatured DNA Duplex DNA that has been converted to 
single strands by breaking the hydrogen bonds of comple¬ 
mentary nucleotide pairs. 

Deoxyribonuclease I See DNase I. 

Deoxyribonucleic acid See DNA. 

Deoxyribose The five-carbon sugar component of DNA. 

Deoxyribozyme A DNA molecule that has catalytic activity. 

Derepression Displacement of a repressor protein from a 
promoter or operator region of DNA; the "turning on" of a 
gene. When attached to the DNA, the repressor protein pre¬ 
vents RNA polymerase from initiating transcription. 

Dermis The thicker, inner layer of skin in vertebrates, lying 
beneath the epidermis. 

dG-dC tailing See Complementary homopolymeric tailing. 

Diagnosis Determination of the cause of a disorder. 

Diagnostic procedure A test or assay used to determine the 
presence of an organism, substance, or nucleic acid sequence 
alteration that represents a disease or pathogenic condition. 

Diaminopimelic acid The immediate precursor to L-lysine 
in bacteria and plants and a component of some bacterial cell 
walls. 

Diazotroph An organism that can fix nitrogen. 

Dicistronic vector A mammalian cloning vector that is spe¬ 
cially designed to carry two functional genes. 

Dicotyledon A class of plants that has two seed leaves. Also 
called dicot. 

Dideoxynucleotide A nucleoside triphosphate that lacks 
hydroxyl groups on both the 2' and 3' carbons of the pentose 
sugar. Also called ddNTP. 

Digest See Cleave. 

Dihydrofolate reductase An enzyme that catalyzes the for¬ 
mation of tetrahydrofolic acid. 

Diploid A cell or organism that has two complete sets of 
chromosomes. 

Directed mutagenesis The process of generating nucleotide 
changes in cloned genes by any one of several procedures, 
including site-specific and random mutagenesis. Also called 
in vitro mutagenesis. 


Disarm To delete from a plasmid or virus those genes that 
are cytotoxic or that induce crown gall formation. 

Dithiothreitol A low-molecular-weight thiol-containing 
reducing agent. It is added to buffers in low concentrations to 
prevent protein sulfhydryl groups from being oxidized. At 
higher concentrations, it is used to reduce disulfide linkages 
in proteins. 

DNA Deoxyribonucleic acid; the genetic material of living 
things. 

DNA codon A set of three contiguous deoxyribonucleotide 
pairs of the coding region of a structural gene where the bases 
of one strand are transcribed into a codon. 

DNA construct A cloning vector with a DNA insert. 

DNA delivery system A generic term for any procedure that 
facilitates the uptake of DNA by a recipient cell. 

DNA fingerprint A set of DNA fragments that are character¬ 
istic of a particular source of DNA, such as an insert of a 
clone. In some cases, restriction endonuclease DNA frag¬ 
ments are visualized by hybridization after gel electropho¬ 
resis. In other instances, the polymerase chain reaction (PCR) 
is used to generate a distinctive pattern of DNA bands that 
are evident after gel electrophoresis. 

DNA fingerprinting A comparative diagnostic technique 
that characterizes the DNA of an organism or a sample. 

DNA hybridization The pairing of two DNA molecules, 
often from different sources, by hydrogen bonding between 
complementary nucleotides. This technique is frequently 
used to detect the presence of a specific nucleotide sequence 
in a DNA sample. 

DNA microarray An array of thousands of gene sequences 
or oligonucleotide probes bound to a solid support. Also 
called DNA chip, gene array. 

DNA polymerase An enzyme that links an incoming 
deoxyribonucleotide, which is determined by complementarity to a 
deoxyribonucleotide in a template DNA strand, with a 
phosphodiester bond to the 3' hydroxyl group of the last 
incorporated nucleotide of the growing strand during replication. 

DNA probe A segment of DNA that is labeled (tagged) so 
that, after a DNA hybridization reaction, any base pairing 
between the probe and a complementary base sequence in a 
DNA sample can be detected. 

DNA transformation See Transfection; Transformation. 

DNA typing See Genotyping. 

DNase I An enzyme that degrades DNA. It is used to remove 
DNA from RNA preparations and from cell-free extracts. 
Also called deoxyribonuclease I. 

Domain A segment of a protein that has a discrete function 
or conformation. At the protein level, a domain can be as 
small as a few amino acid residues and as large as half of the 
entire protein. 

Dominant (1) When heterozygous and homozygous genotypes 
determine the same phenotype, the gene is said to be 
dominant. (2) An allele that produces the same phenotype 
whether the genotype is heterozygous or homozygous. 

Dominant gene One of a pair of alleles that is sufficient to 
produce a phenotype in a heterozygote. 

Dominant marker selection Selection in which a gene 
encodes a product that enables only the cells that carry the 
gene to grow under certain conditions. For example, plant 
and animal cells that express the introduced Neo^r gene are 
resistant to the compound G-418, and cells that do not carry 
the Neo^r gene are killed by G-418. Also called positive 
selection, positive selectable marker. 

Double crossover Two simultaneous reciprocal 
breakage-and-reunion events between two DNA molecules. 

Double heterozygote Two different gene loci, each with two 
different alleles. 

Doubling time See Generation time. 

Downstream (1) In molecular biology, the stretch of nucle¬ 
otides of DNA that lie in the 3' direction from the site of ini¬ 
tiation of transcription, which is designated +1. Downstream 
nucleotides are marked with plus signs, e.g., +2 and +10. The 
term also refers to the 3' side of a particular gene or sequence 
of nucleotides. (2) In chemical engineering, those phases of a 
manufacturing process that follow the biotransformation 
stage. Refers to recovery and purification of the product of a 
fermentation process. Also called downstream processing. 

Downstream processing See Downstream. 

Drug See Therapeutic agent. 

Dry weight The weight of a sample of biological material 
that has been dried in an oven to remove the water. 

Duplex DNA Double-stranded DNA. 

E value Expectation value for a BLAST analysis; the lower 
the E value, the more significant the alignment score. The 
number of alternate alignments with a score equal to or better 
than the similarity score for a given alignment that can be 
expected to occur simply by chance. 
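
To illustrate why a lower E value indicates a more significant alignment, the following minimal sketch computes the expected number of chance hits from a normalized bit score. It assumes the commonly cited relation E = m × n × 2^(−S'), where m and n are the query and database lengths and S' is the bit score; this formula is not given in this glossary, real BLAST implementations use adjusted effective lengths, and the function name is hypothetical.

```python
def expected_chance_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value, assuming E = m * n * 2**(-S') for bit score S'."""
    return query_len * db_len * 2.0 ** (-bit_score)


# A higher bit score in the same search space gives a lower E value.
weak = expected_chance_hits(bit_score=20, query_len=300, db_len=1_000_000)
strong = expected_chance_hits(bit_score=60, query_len=300, db_len=1_000_000)
print(weak > strong)
```

Under this assumption, each additional bit of score halves the number of alignments expected by chance.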

Effector A low-molecular-weight compound that modifies 
the function of a regulatory protein. 

Effector cells Cells of the immune system that degrade anti¬ 
gens. 

Electrophoresis A technique that separates molecules (often 
DNA, RNA, or protein) on the basis of relative migration in a 
strong electric field. 

Electroporation Electrical treatment of cells that induces 
transient pores, through which DNA is taken into the cell. 

Electrotransfer Transfer of a macromolecule by an electric 
field from a gel to a membrane. 

ELISA See Enzyme-linked immunosorbent assay. 

Elongation Sequential addition of one monomer at a time to 
a polymer. 


Embryonic stem cells Cells of an early embryo that can give 
rise to all differentiated cells, including germ line cells. 

Emulsion PCR A technique that uses the polymerase chain 
reaction to produce tens of thousands of copies of a DNA tem¬ 
plate that is bound to a bead within a water-in-oil emulsion. 

Encode To specify, after decoding by transcription and trans¬ 
lation, the sequence of amino acids in a protein. 

End-product inhibition The inhibition of the activity of an 
enzyme by a metabolite. The enzyme is often the first enzyme 
in a biosynthetic pathway, and the metabolite is generally the 
product of the last step in the pathway. 

Endemic With respect to disease, prevalent in a particular 
geographic location. 

Endocytosis Entrance of foreign material into the cell without 
passing through the cell membrane. The membrane folds 
around the material, resulting in the formation of a saclike 
vesicle into which the material is incorporated. 

Endoplasmic reticulum A membrane-enclosed structure in a 
eukaryotic cell that is continuous with the nucleus and whose 
functions include the synthesis, processing, and transport of 
proteins (rough endoplasmic reticulum) and the synthesis of 
lipids (smooth endoplasmic reticulum). 

Endoprotease An enzyme that cleaves the peptide bonds 
between amino acids within a protein. Cleavage is usually at 
one or more specific sites. 

Endosperm A nutritive tissue in flowering plants that sur¬ 
rounds the developing embryo in a seed. 

Endothelial cell A platelike cell that lines the inner surfaces 
of blood and lymph vessels. 

Endotoxin A component of the cell wall of gram-negative 
bacteria that elicits an inflammatory response and fever in 
humans. 

Enhancer A DNA sequence that increases the transcription 
of a eukaryotic gene when they are both on the same DNA 
molecule. Also called enhancer element, enhancer sequence. 

Enolase An enzyme that catalyzes the conversion of 
2-phosphoglycerate to phosphoenolpyruvate. 

Enoylreductase An enzyme involved in the synthesis of 
polyketide antibiotics. 

Enterobacter cloacae A free-living gram-negative soil bacte¬ 
rium that can act as a plant growth-promoting bacterium. 

Enterotoxin A bacterial protein that, after its release into the 
intestine, causes cramps, diarrhea, and nausea. 

Enzyme A protein or complex of proteins that acts as a cata¬ 
lyst to increase the rate of a chemical reaction. 

Enzyme-linked immunosorbent assay A technique for 
detecting specific molecules in a mixed sample. An antibody 
(primary) is bound to the target molecule; another antibody 
(secondary), which binds to the primary antibody, is added 
later. The secondary antibody has attached to it an enzyme 
that can convert a colorless substrate into a colored product. 
If the target molecule is not present in the sample, washing 
steps will remove both antibodies, and no colored product 
will be produced. Also called ELISA. 

Enzyme replacement therapy Treatment of an inherited 
metabolic defect with a protein that facilitates a specific 
chemical reaction. 

Epidermal growth factor A protein with kinase (phosphory¬ 
lation) activity that is involved in triggering cell division in 
animal cells. 

Epidermis The outermost layer of cells of an animal. 

Epigenetic reprogramming Changes in epigenetic modifica¬ 
tions during normal germ cell and embryonic development. 

Epithelial cells Cells covering a surface or lining a cavity, 
e.g., gut epithelial cells. 

Epitope A specific chemical domain on an antigen that is 
recognized by an antibody. Each epitope on a molecule, such 
as a protein, elicits the synthesis of a different antibody. Also 
called antigenic determinant. 

Epitope tag An affinity tag that is recognized by an antibody. 

Error-prone PCR Use of the polymerase chain reaction 
under conditions that promote the insertion of an incorrect 
nucleotide at every few hundred or so nucleotides of the 
template. Used as a method of random mutagenesis. 

ES cells See Embryonic stem cells. 

EST See Expressed sequence tag. 

Established cell line A population of cells that is grown in 
vitro and that can be subcultured indefinitely. 

Estrogenic compounds Natural and synthetic chemicals that 
act in a manner similar to that of the steroid hormone 
estrogen, which is responsible for the development of female 
sex characteristics. 

Ethylene A gaseous compound that acts as a plant hormone. 
It is important in fruit ripening, flower senescence, seed ger¬ 
mination, rooting of cuttings, root elongation, and the 
response of the plant to environmental stress. 

Euchromatin The form of chromatin that is less compacted 
and is often more actively transcribed. 

Eukaryotes Organisms, including animals, plants, fungi, 
and some algae, that have (1) chromosomes enclosed within 
a membrane-bounded nucleus and (2) functional organelles, 
such as mitochondria and chloroplasts, in the cytoplasm of 
their cells. 

Excision The natural or in vitro enzymatic release (removal) 
of a DNA segment from a chromosome or cloning vector. 

Excrete See Export. 

Exogenous Derived externally, or foreign. 

Exogenous DNA DNA that has been derived from a source 
organism and has been cloned into a vector and introduced 
into a host cell. Also referred to as foreign or heterologous 
DNA. 


Exon A segment of a gene that is transcribed as part of the 
primary transcript and is retained, after being processed, 
with other exons to form a functional mRNA molecule. 

Exonuclease III An Escherichia coli enzyme that removes 
nucleotides from the 3' hydroxyl ends of double-stranded 
DNA. Also called ExoIII, exodeoxyribonuclease III. 

Exopolysaccharide A high-molecular-weight polymer that is 
composed of sugar residues and is secreted by a microor¬ 
ganism into the surrounding environment. 

ExoIII See Exonuclease III. 

Export To transport a protein out of a cell. Also to secrete or 
to excrete. 

Expressed sequence tag A partially sequenced cDNA clone 
for which a PCR assay exists. Also called EST, 
expressed-sequence-tagged site, eSTS. 

Expression Transcription and translation of a gene. 

Expression library A population of different DNA molecules 
cloned into an expression vector. 

Expression profile Determination of the members of a 
transcriptome or proteome in a cell, tissue, or organism. 

Extension A single-stranded DNA region consisting of one 
or more nucleotides at the end of a strand of duplex DNA. 
Also called protruding end, sticky end, overhang, cohesive 
end. 

Extracellular matrix The organized carbohydrate and pro¬ 
tein structure that is secreted by animal cells when they are 
part of a tissue. 

Extrachromosomal DNA A replicatable DNA element that is 
not part of a chromosome. 

Exudate A low-molecular-weight compound (sugar, amino 
acid, etc.) that leaks out of plant tissues, including seeds, 
roots, and leaves. 


False negative A test result that does not recognize a target 
when it is present in a sample. 

False positive A test result that indicates the presence of a 
target when it is not in a sample. 

Fed-batch fermentation Growth of cells or microorganisms 
during which nutrients are added periodically to the biore¬ 
actor. 

Feedback inhibition See End-product inhibition. 

Fermentation (1) In chemical engineering, the growth of 
cells or microorganisms in specialized vessels (fermenters or 
bioreactors). (2) In biochemistry, the breakdown of carbon 
compounds by cells or organisms to synthesize ATP without 
using molecular oxygen. 

Fermenter See Bioreactor. 

Ferredoxin An iron-sulfur protein that acts as an electron 
carrier. 

Ferritin A family of iron storage proteins found in animals, 
plants, fungi, and bacteria. 

Fertile Capable of breeding and reproduction. 

Fibroblast A spindle-shaped cell found in connective tissue. 

5' extension A short single-stranded nucleotide sequence on 
the 5' phosphate end of a double-stranded DNA molecule. 
Also called 5' protruding end, 5' sticky end, 5' overhang. 

5' phosphate end The phosphate group that is attached to 
the 5' carbon atom of the sugar (ribose or deoxyribose) of the 
terminal nucleotide of a nucleic acid molecule. 

Flavonoid A member of a class of plant phenolic compounds 
containing two aromatic rings connected by a three-carbon 
bridge. 

Fluorescein A fluorescent dye often used to label antibodies 
so that they may be visualized after they have reacted with 
antigens in cells. 

Fluorescence-activated cell sorting A technique used to 
separate cells based on the amount of fluorescence they emit. 
Also called FACS. 

Fluorography Detection of the emission of light from a 
labeled source on X-ray film. 

Fluorophore That portion of a molecule that can fluoresce. 

Flush end See Blunt end. 

Flush-end cut See Blunt-end cut. 

Foliar Consisting of or pertaining to leaves. 

Foreign DNA A DNA molecule that is incorporated into 
either a cloning vector or a chromosomal site. 

Fosmid A cloning vector based on the F plasmid of Escherichia 
coli that can carry large segments of cloned DNA. 

Fouling The coating or plugging of equipment by materials or 
microorganisms that prevents it from functioning properly. 

Founder animal In transgenesis research, an organism that 
carries a transgene in its germ line and that can be used in 
matings to establish a pure-breeding transgenic line or one 
that acts as a breeding stock for transgenic animals. 

Four-cutter A type II restriction endonuclease that binds and 
cleaves DNA at sites that contain four nucleotide pairs. 
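
As a small illustration, the sketch below locates the recognition sites of a four-cutter in a DNA string and estimates how often such a site is expected by chance. The site GATC, the example sequence, and the function names are assumptions chosen for illustration, and the spacing estimate assumes a random sequence with equal base frequencies.

```python
def find_sites(sequence: str, site: str = "GATC") -> list[int]:
    """Return 0-based start positions of every occurrence of the site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def expected_site_spacing(site_length: int) -> int:
    """Average distance between sites in random DNA with equal base usage."""
    return 4 ** site_length


print(find_sites("AAGATCTTGATCGATC"))
print(expected_site_spacing(4))
```

With four possible bases at each of four positions, a given site is expected about once every 256 nucleotide pairs, so a four-cutter cleaves far more often than an enzyme with a longer recognition site.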

Frameshift mutation In chromosomal DNA, an insertion or 
deletion of base pairs that changes the reading frame of a 
gene. 

Fructans Polymers of fructose that are not degraded in the 
human digestive tract. 

Functional gene cloning A strategy for isolating a gene that 
depends on information about its product. 

Functional genomics The large-scale study of gene expres¬ 
sion. 

Functional protein microarray An array, composed of as 
many members of a proteome as possible, that is used to 
study the activities of a proteome. 


Fusion protein The product of two or more coding sequences 
from different genes that have been cloned together and that, 
after translation, form a single polypeptide sequence. Also 
called hybrid protein, chimeric protein. 

G A guanine residue in either DNA or RNA. 

G cap The 5'-terminal methylated guanine nucleotide that is 
present on many eukaryotic mRNAs; it is joined, after tran¬ 
scription, to the mRNA in a 5'-to-5' linkage. 

Gamete A cell with a haploid chromosome content. In ani¬ 
mals, a sperm or egg; in plants, pollen or ovum. 

Gametocyte A eukaryotic germ cell that divides by mitosis 
to form additional gametocytes or by meiosis to form 
gametids. Male gametids are spermatids, while female 
gametids are ootids. 

Gap A missing internal segment of one strand of duplex 
DNA. 

Gapped DNA A duplex DNA molecule with one or more 
internal single-stranded regions. 

Gateway cloning technology A system that utilizes the 
attachment sites used by bacteriophage λ for integration into 
and excision out of the Escherichia coli chromosome for 
cloning DNA, especially coding sequences, into expression 
vectors. 

Gel matrix A semisolid macromolecular lattice that is used 
for the electrophoretic fractionation of macromolecules. 

Gelatinization Steam cooking of milled grain, a process that 
increases the surface area of the starch and converts the 
original mash into a material with a gel-like consistency. 

Gene A segment of nucleic acid that encodes a functional 
protein or RNA. The unit of inheritance. 

Gene bank A population of organisms, each of which carries 
a DNA molecule that was inserted into a cloning vector. 
Ideally, all of the cloned DNA molecules represent the entire 
genome of another organism. Also called gene library, clone 
bank, bank, library. This term is sometimes also used to 
denote all of the vector molecules, each carrying a piece of the 
chromosomal DNA of an organism, before the insertion of 
these molecules into a population of host cells. 

Gene cloning Insertion of a gene into a DNA vector (often a 
plasmid) to form a new DNA molecule that can be perpetu¬ 
ated in a host cell. Also called recombinant DNA technology, 
genetic engineering, gene splicing, gene transplantation, 
molecular cloning, cloning. 

Gene expression Synthesis of RNA, and often protein, 
directed by the nucleotide sequence in a specific segment of 
DNA (gene). 

Gene flow The transfer of genes from one population to 
another through interbreeding. 

Gene library See Gene bank. 

Gene map The linear array of genes of a chromosome. 

Gene therapy Use of a gene or cDNA to treat a disease. 

Generally regarded as safe In the United States, a designa¬ 
tion given to foods, drugs, and other materials that have been 
used for a considerable period and that have a history of not 
causing illness in humans, even though extensive toxicity 
testing has not been conducted. More recently, certain host 
organisms for recombinant DNA experimentation have been 
given this status. 

Generation time The time that it takes for a population of 
single-celled organisms to double its cell number. Also called 
doubling time. 
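
The generation time can be estimated from two cell counts taken during exponential growth. This minimal sketch assumes ideal exponential growth, N = N0 × 2^(t/g); the function name and the example counts are hypothetical.

```python
import math


def generation_time(initial_count: float, final_count: float, elapsed: float) -> float:
    """Doubling time g, assuming N = N0 * 2**(t / g) over the elapsed time."""
    doublings = math.log(final_count / initial_count) / math.log(2)
    return elapsed / doublings


# A culture that grows from 1e6 to 8e6 cells in 60 minutes has doubled 3 times.
print(generation_time(1e6, 8e6, 60))
```

The result is in the same time unit as the elapsed time, here minutes.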

Genetic code The complete set of 64 codons, of which 61 specify 
the 20 amino acids and 3 are termination codons. 
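
The counts in this definition can be checked by enumerating the standard genetic code. The sketch below builds the codon table from the conventional compact notation (bases ordered T, C, A, G, with * marking a termination codon); the variable names are illustrative, and the table is the standard nuclear code, not one of the variant codes used by some organelles and organisms.

```python
from itertools import product

BASES = "TCAG"  # conventional base order for the compact table
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Pair each of the 4 x 4 x 4 triplets with its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE), len(amino_acids), stop_codons)
```

The codons here are written in DNA form; replacing T with U gives the corresponding mRNA codons.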

Genetic complementation When two genomes or DNA 
molecules that are in the same cell produce a function that neither 
genome nor DNA molecule can supply on its own. Also called 
complementation. 

Genetic engineering See Gene cloning. 

Genetic heterogeneity The condition or state of a population 
in which different mutant genes produce the same pheno¬ 
type. Also called heterogeneity. 

Genetic immunization Delivery of a cloned gene that 
encodes an antigen to a host organism. After the cloned gene 
is expressed, it elicits an antibody response that protects the 
organism from infection by a virus, bacterium, or other dis¬ 
ease-causing organism. 

Genetic linkage See Linkage. 

Genetic map The linear array of genes on a chromosome 
based on recombination frequencies. Also called linkage 
map. 

Genetic mapping Determining the linear order of marker 
sites along a chromosome. Also called mapping. 

Genetic polymorphism A situation in which two or more 
alleles of a locus in a population of individuals occur at a 
frequency of 1% or greater. Often, in the appropriate context, 
it is simply called polymorphism. 
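
The 1% criterion can be applied directly to a sample of alleles. This is a minimal sketch that assumes the sampled alleles represent the population; the function name and the example counts are hypothetical.

```python
from collections import Counter


def is_polymorphic(alleles: list[str], threshold: float = 0.01) -> bool:
    """True if at least two alleles each reach the frequency threshold."""
    counts = Counter(alleles)
    total = sum(counts.values())
    common = [a for a, n in counts.items() if n / total >= threshold]
    return len(common) >= 2


# 2% minor-allele frequency qualifies; 0.5% does not.
print(is_polymorphic(["A"] * 980 + ["a"] * 20))
print(is_polymorphic(["A"] * 995 + ["a"] * 5))
```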

Genetic test An assay that determines whether the cause of 
a disorder is at the DNA level. 

Genome (1) The entire complement of genetic material of an 
organism, virus, or organelle. (2) The haploid set of chromo¬ 
somes (DNA) of a eukaryotic organism. 

Genomics The study and development of genetic and phys¬ 
ical maps, large-scale DNA sequencing, gene discovery, and 
computer-based systems for managing and analyzing the 
genome of an organism. 

Genotype (1) The genetic constitution of an organism. (2) 
The alleles at a genetic locus. 

Genotyping The determination of the alleles of a 
chromosome of an individual. Also called DNA typing, 
haplotyping. 

Germ line cells Cells that produce gametes. 


Germ line gene therapy The delivery of a gene(s) to a fertil¬ 
ized egg or an early embryonic cell. The transferred gene(s) is 
present in all nuclei of the cells of the mature individual, 
including the reproductive cells, and alters the phenotype of 
the developed individual. 

Germinal disc A layer of cells on the surface of an egg yolk 
that will form the embryo. 

Gluconeogenesis The synthesis of glucose from noncarbo¬ 
hydrates, such as fat or protein. 

Glutathione A tripeptide comprising the amino acids glu¬ 
tamic acid, cysteine, and glycine that acts as an antioxidant. 

Glycation The nonenzymatic covalent addition of sugar or 
sugar-related molecules to proteins. 

Glycoalkaloids A group of toxic compounds found in some 
plants. 

Glycogen synthase An enzyme that catalyzes one of the 
steps in the biosynthesis of glycogen from glucose. 

Glycosylation The enzymatic covalent addition of sugar or 
sugar-related molecules to proteins or polynucleotides. 

Glyphosate A broad-spectrum herbicide that inhibits the 
synthesis of aromatic amino acids in plants. 

Golden rice Rice that has been genetically engineered to 
express three foreign genes and thereby to synthesize the 
vitamin A precursor β-carotene, imparting a yellow or golden 
color to the rice grains. 

Golgi apparatus Flattened stacks of membranous sacs that 
process and sort proteins and other macromolecules, espe¬ 
cially those that are secreted by the cell. 

Gram-negative organism Any prokaryotic organism that 
does not retain the first stain (crystal violet) used in the Gram 
technique. It does retain the second stain (safranin O) and 
therefore has a pink color when viewed under a light micro¬ 
scope. Retention of the stain is due to the structure of the cell 
wall. 

Gram-positive organism Any prokaryotic organism that 
retains the first stain (crystal violet) used in the Gram tech¬ 
nique, which gives a purple-black color when viewed under 
a light microscope. Retention of the stain is due to the struc¬ 
ture of the cell wall. 

Gratuitous inducer A substance that can induce transcrip¬ 
tion of a gene(s) but that is not a substrate for the induced 
enzyme(s). 

Green fluorescent protein A protein that emits green 
fluorescence after excitation by light of a specific wavelength. It is 
produced naturally by the jellyfish Aequorea victoria and other 
marine organisms and has a variety of applications in 
molecular biotechnology. 

Guanine One of the organic bases found in either DNA or 
RNA. 

GUS The bacterial enzyme β-glucuronidase, which is 
commonly used as a marker in the production of transgenic 
plants. 

HAC See Human artificial chromosome. 

Hairpin loop A segment of single-stranded DNA or RNA 
that is folded back upon itself and held together by base 
pairing in a structure that is locally double stranded; it may 
be represented on paper as having the appearance of a tradi¬ 
tional lady's hairpin. 

Haploid Having one copy of each autosome and one sex 
chromosome. 

Haplotype The alleles of the loci of a chromosome. A set of 
closely linked genetic markers present on one chromosome 
that tend to be inherited together. The word is derived by 
combining "haplo" from haploid and "type" from genotype. 

Haplotyping See Genotyping. 

Heat shock proteins The proteins synthesized in the nearly 
universal response of organisms to environmental stress, 
such as high temperature. 

Helper cells T cells that assist other cell types to respond 
immunologically to the presence of an antigen. Also called 
helper T lymphocytes. 

Helper plasmid A plasmid that provides a function(s) to 
another plasmid in the same cell. Some helper plasmids are 
used to mobilize nonconjugative plasmids from a donor cell 
into a recipient cell. 

Helper virus A virus that provides a function(s) to another 
virus or viral genome in the same cell. 

Hemocoel Spaces between the organs of organisms with 
open circulatory systems, like most arthropods and 
mollusks. 

Hemorrhagic colitis A type of gastroenteritis in which cer¬ 
tain strains of the bacterium Escherichia coli infect the large 
intestine and produce a toxin (Shiga toxin) that causes bloody 
diarrhea and other serious complications. Hemorrhagic colitis 
occurs in people of all ages but is most common among chil¬ 
dren and older people. 

Hepatocyte A cell that makes up the major mass of the liver 
and that is involved in synthesis of cholesterol, bile salts, and 
phospholipids as well as detoxification, modification, and 
excretion of various substances. 

Heterochromatin Tightly compacted regions of chromatin 
that are usually transcriptionally silent. 

Heterogeneity See Genetic heterogeneity. 

Heterologous From a different source, as in heterologous 
DNA. 

Heterologous probe A DNA probe that is derived from one 
organism and used to screen for a similar DNA sequence in a 
clone bank derived from another organism. 

Heterologous protein See Recombinant protein. 

Heteromer A protein with two or more different protein 
chains. Also called heteromeric polypeptide, heteromeric 
protein. 


Heterozygote An individual that has different alleles at the 
same locus in its two homologous chromosomes. 

HGP See Human Genome Project. 

High-resolution map A genetic or physical map with closely 
spaced sites throughout. 

Histone A protein that binds to DNA and aids in the com¬ 
paction of chromosomes in the nucleus. 

Holoenzyme A catalytically active enzyme containing all of 
the necessary cofactors and subunits. 

Homodimer A protein with two identical polypeptide 
chains. 

Homologous From the same source or having the same evo¬ 
lutionary function or structure. 

Homology Similarity due to a common origin. 

Homomer A protein with two or more identical protein 
chains. Also called homomeric polypeptide, homomeric pro¬ 
tein. 

Homopolymer A nucleic acid strand that is composed of one 
kind of nucleotide. 

Homopolymeric tailing See Tailing. 

Homozygote An individual that has identical alleles at the 
same locus in its two homologous chromosomes. 

Hormone A chemical secreted by specialized cells that signals 
specific functions in target cells. 

Host A microorganism, organism, or cell that maintains a 
cloning vector. 

Human artificial chromosome A chromosome that is assem¬ 
bled from telomere, centromere, and human genomic DNA 
sequences. Also called HAC, microchromosome. 

Human cytomegalovirus A virus belonging to the herpes¬ 
virus group. 

Human Genome Project An international research effort 
dedicated to developing high-resolution human genetic and 
physical maps and the complete genomic DNA sequences of 
humans and other organisms. Also called HGP. 

Human minisatellite DNA Human DNA that is noncoding 
and generally G+C rich and that contains tandem repeats of 
short (9- to 40-base-pair) stretches of DNA. 

Humanized Having segments of a cloned (antibody) gene, 
usually from mice, replaced by comparable regions of human 
DNA. Such a recombinant protein is less likely to be recog¬ 
nized as a foreign protein when it is used as a human thera¬ 
peutic agent. 

Humoral immune response The production of antibody by 
B cells of the immune system in response to the presence of a 
foreign antigen. 

Hybrid gene The combination of two genes or parts of two 
genes in the correct reading frame that encodes a single 
protein that has amino acid sequences from both genes. 

Hybrid selection A protocol for determining which genomic 
clones hybridize to a cDNA or mRNA molecule. 

Hybridization The pairing of two polynucleotide strands, 
often from different sources, by hydrogen bonding between 
complementary nucleotides. 

Hybridoma The product of the fusion of a myeloma cell with 
an antibody-producing lymphocyte. This cell combination 
(hybridoma) can continue to divide in cell culture and secrete 
a single type of antibody. 

Hydrogen bond A weak chemical bond that is formed when 
a hydrogen atom of a polar molecule that has a partial posi¬ 
tive charge is attracted to an atom of another polar molecule 
that has a partial negative charge. 

Hydrogen uptake positive A term describing a 
microorganism that is capable of assimilating (taking up) hydrogen 
gas. Also called Hup+. 

Hydromechanical stress See Shear. 

Hydroponics The growth of plants in liquid medium rather 
than in soil. 

Hydroxyl group A functional group consisting of an oxygen 
atom and a hydrogen atom (-OH). 

Hyperaccumulator A plant that can naturally accumulate 
large amounts of metal from the environment. These plants 
are often used for the phytoremediation of metals. 

Hypervariable region The parts of both the heavy and light 
chains of an antibody molecule that enable it to bind to a 
specific site on an antigen. 

Hypervariable segment A region of a protein that varies 
considerably between strains or individuals. 

Hyphae Filamentous cells of some fungi. 

IAA See Indole-3-acetic acid. 

Ice nucleation protein A protein around which ice crystals 
form. 

Ice-minus bacteria Bacteria that do not synthesize ice nucle¬ 
ation proteins. 

I/E region See Integration-excision region. 

Ileum The lower region of the small intestine just before the 
large intestine. 

Immediate-early gene A viral gene that is expressed 
promptly after infection. 

Immune response The processes, including the synthesis of 
antibodies, that are used by vertebrates to respond to the 
presence of a foreign antigen. 

Immunoaffinity chromatography A purification technique 
in which an antibody is bound to a matrix and is subse¬ 
quently used to bind a specific protein and separate it from a 
complex mixture. 


Immunoassay A protocol that uses antibody specificity to 
detect the presence of a particular compound in a biological 
sample. 

Immunogen A substance that induces an antibody response. 
Also called antigen. 

Immunoglobulin See Antibody. 

Immunosuppression Prevention or diminishment of the 
immune response by a substance, agent, or condition. 

Immunotherapeutic procedure The use of an antibody or a 
fusion protein containing the antigen-binding site of an anti¬ 
body to treat a disease and enhance the well-being of a 
patient. 

Immunotoxin A fusion protein that has separate domains 
with antibody and toxin activities. The antibody portion of 
the molecule facilitates binding to a target molecule or cell, 
and the toxin inactivates the target molecule or kills the cell. 

Impeller An agitator that is used for mixing the contents of 
a bioreactor. 

In vitro mutagenesis See Directed mutagenesis. 

In vitro translation Protein synthesis that is directed by either 
purified DNA with bacterial extracts or mRNA with wheat 
germ or rabbit reticulocyte extracts that provide ribosomes, 
tRNAs, and protein synthesis factors. The reaction mixture is 
often supplemented with ATP, GTP, and amino acids. 

In vivo gene therapy The delivery of a gene(s) to a tissue or 
an organ of an individual to alleviate a genetic disorder. 

Inactivated agent A virus, bacterium, or other organism that 
has been treated to prevent it from causing a disease. 

Inclusion body A protein that is overproduced in a recombi¬ 
nant bacterium and forms a crystalline array of mostly inac¬ 
tive protein inside the bacterial cell. 

Incompatibility group A classification scheme indicating 
which plasmids can coexist within a single cell. Plasmids 
must belong to different incompatibility groups to coexist 
within the same cell. Plasmids that belong to the same incom¬ 
patibility group are unstable when placed into the same cell. 
A plasmid cloning vector should always belong to an incom¬ 
patibility group different from that of the host bacterium's 
endogenous plasmids. 

Independent assortment The formation of all possible gene 
combinations in gametes with genes on different chromo¬ 
somes, followed by the random joining of male and female 
gametes. Also called Mendel's second law of inheritance. 
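
The gamete combinations described above can be enumerated directly. The sketch below is illustrative only: the function name and the two-allele-per-gene representation are assumptions, and each gene is assumed to lie on a different chromosome.

```python
from itertools import product

def gametes(genotype):
    """Return every allele combination that can appear in a gamete.

    genotype: a list of allele pairs, one pair per gene; each gene is
    assumed to be on a different chromosome (hypothetical representation).
    """
    return ["".join(combo) for combo in product(*genotype)]

# A doubly heterozygous parent (AaBb) yields four gamete types.
print(gametes([("A", "a"), ("B", "b")]))
```

With n heterozygous genes on different chromosomes, the list has 2^n entries, which is why the number of possible combinations grows so quickly.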

Indole-3-acetic acid A plant hormone which stimulates both 
rapid responses, such as increases in cell elongation, and 
long-term effects, such as increases in cell division and dif¬ 
ferentiation. Abbreviated IAA. 

Inducer A low-molecular-weight compound or a physical 
agent that interacts with or alters a repressor protein and 
prevents it from blocking transcription. 

Induction Turning on the transcription of a specific gene or 
operon. The consequences of the action of an inducer. 

Infectious agent Often, a proliferating virus, bacterium, or 
parasite that causes a disease in plants or animals. 

Informatics See Bioinformatics. 

Initiation The start of the biosynthesis of a polymeric macro¬ 
molecule. 

Initiation codon The codon AUG, which specifies the first 
amino acid (methionine [N-formylmethionine in prokary¬ 
otes]) of a protein. Also called initiator, translational start 
codon, translational initiation signal. 

Initiation complex The fMet-tRNAfMet-mRNA-small ribosomal 
subunit-large ribosomal subunit combination in prokaryotes 
or the Met-tRNAMet-mRNA-small ribosomal subunit-large 
ribosomal subunit combination in eukaryotes that is ready for 
the elongation phase of translation. 

Initiator See Initiation codon. 

Initiator element See Response element. 

Initiator tRNA The fMet-tRNAfMet in prokaryotes or 
Met-tRNAMet in eukaryotes that starts translation. 

Insecticide A substance or living organism that kills insects. 

Insert A DNA molecule that is incorporated into a cloning 
vector. 

Insulin A small peptide hormone secreted by the pancreas 
that regulates glucose uptake by cells and other aspects of 
carbohydrate metabolism. 

Integrating vector A vector that is designed to integrate 
cloned DNA into the host cell chromosomal DNA. 

Integration Insertion of a DNA molecule (usually by homol¬ 
ogous recombination) into a chromosomal site. 

Integration-excision region The portion of bacteriophage λ 
DNA that enables bacteriophage λ DNA to be inserted into a 
specific site in the Escherichia coli chromosome and excised 
from this site. 

Intein An internal segment of a protein that catalyzes its 
own excision from a protein precursor. Used in the construc¬ 
tion of self-cleaving fusion proteins. 

Interactome An extensive set of interacting proteins. 

Interleukin-2 A lymphokine secreted by certain T lympho¬ 
cytes that stimulates T-cell proliferation. 

Internal ribosomal entry site A nontranslated sequence 
following a coding region of a polycistronic RNA that binds to 
a small ribosomal subunit and forms an initiation-of-translation 
complex. 

Intervening sequence See Intron. 

Intron A segment of a gene that is transcribed but is then 
excised from the primary transcript during processing into a 
functional RNA molecule. Also called intervening sequence. 

Ion channel An integral protein within a cell membrane that 
facilitates selective ion transport. 


IPTG Isopropyl-β-D-thiogalactopyranoside, an inducer of 
the lac (lactose) operon. In recombinant DNA technology, 
IPTG is often used to induce cloned genes that are under the 
control of the lac repressor-lac promoter system. 

IRES See Internal ribosomal entry site. 

Ischemia Inadequate blood supply to a local area due to 
blockage of the blood vessels to that area. 

Isoelectric point The pH at which the net charge on a protein 
or other molecule is zero. 

Isoschizomers Restriction enzymes that recognize and bind 
to the same nucleotide sequence in DNA and cut at the same 
site. 

Isopropyl-β-D-thiogalactopyranoside See IPTG. 

Jejunum The portion of the mammalian small intestine 
between the duodenum and the ileum. The jejunum is lined 
with small outgrowths called villi that facilitate the absorp¬ 
tion of digested material. 

kb See Kilobase pair. 

kcat The catalytic rate constant that characterizes an 
enzyme-catalyzed reaction. The higher the kcat, the faster the 
conversion of substrate into product. 

kcat/Km The catalytic efficiency of an enzyme-catalyzed 
reaction. The higher the value of kcat/Km, the more rapidly and 
efficiently the substrate is converted into product. 

Ketosynthase A low-molecular-weight enzyme involved as 
part of a larger complex in polyketide biosynthesis. 

Kilobase pair One thousand base pairs; a unit of length of 
DNA. Abbreviated kb. 

Kindred A group of individuals who are related to each 
other either genetically or by marriage. Also called kinship. 

Klenow fragment A product of proteolytic digestion of the 
DNA polymerase I from Escherichia coli that retains both poly¬ 
merase and 3' exonuclease activities but not 5' exonuclease 
activity. 

Kluyvera ascorbata A free-living gram-negative soil bacte¬ 
rium that can act as a plant growth-promoting bacterium. 

Km A dissociation constant that characterizes the binding of 
an enzyme to a substrate. The lower the Km, the tighter the 
binding of the enzyme to the substrate. Also called Michaelis 
constant. 
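
How the catalytic rate constant and the Michaelis constant combine can be sketched numerically. The example below assumes the standard Michaelis-Menten rate law, v = kcat[E][S]/(Km + [S]), which is not stated in this glossary; the function names and numeric values are hypothetical.

```python
def reaction_rate(kcat, km, enzyme_conc, substrate_conc):
    """Initial rate under the Michaelis-Menten model (an assumed model).

    kcat in 1/s; km, enzyme_conc, and substrate_conc in the same
    concentration units.
    """
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)

def catalytic_efficiency(kcat, km):
    """Catalytic efficiency, the ratio of kcat to the Michaelis constant."""
    return kcat / km

# Hypothetical enzyme: kcat = 100 per second, Michaelis constant = 2.0 mM.
# When the substrate concentration equals the Michaelis constant,
# the rate is half of the maximum rate (kcat * enzyme_conc).
print(reaction_rate(100.0, 2.0, 1.0, 2.0))
print(catalytic_efficiency(100.0, 2.0))
```

Under this model, an enzyme with a lower Michaelis constant reaches half of its maximum rate at a lower substrate concentration, which matches the reading of a low value as tight binding.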

Knockout The targeted disruption of a gene by homologous 
recombination. Also called gene knockout. 

Kozak sequence A specific sequence of nucleotides sur¬ 
rounding the start codon in higher eukaryotic organisms. 


Label A compound or atom that is either attached to or 
incorporated into a macromolecule and is used to detect the 
presence of a compound, substance, or macromolecule in a 
sample. Also called tag. 

Large ribosomal subunit The larger component of a ribo¬ 
some. 

Latent agent Usually a virus that is present in a host 
organism but does not produce any symptoms. 

Leader peptide See Signal sequence. 

Leader sequence A sequence of nucleotides at the 5' end of 
an mRNA that is not translated into protein. 

Lectin One of a group of plant proteins that can bind to spe¬ 
cific oligosaccharides on the surface of a cell. Lectins are often 
found in seeds, where they act as a toxin against certain 
pathogenic agents. 

Lentiviral vector A retroviral vector, derived from the 
lentivirus, that can be used to deliver a gene of interest into the 
genomes of dividing and nondividing host cells. 

Library See Gene bank. 

Ligand A molecule that specifically binds to a larger mole¬ 
cule. 

Ligase chain reaction A technique for determining the pres¬ 
ence or absence of a specific nucleotide pair within a target 
gene. 

Ligation Joining of two DNA molecules by the formation of 
phosphodiester bonds. In vitro, this reaction is usually cata¬ 
lyzed by the enzyme T4 DNA ligase. 

Lignocellulose The combination of lignin, hemicellulose, 
and cellulose that forms the structural framework of plant 
cell walls. 

Linkage The occurrence of two or more genes on the same 
chromosome. 

Linkage map See Genetic map. 

Linker A synthetic double-stranded oligonucleotide that car¬ 
ries the sequence for one or more restriction endonuclease 
sites. 

Lipase An enzyme that degrades lipids. 

Lipofection Delivery into eukaryotic cells of DNA, RNA, or 
other compounds that have been encapsulated in an artificial 
phospholipid vesicle. 

Lipopolysaccharide A compound containing lipid bound to 
a polysaccharide; often a component of microbial cell walls. 
Also called LPS. 

Lipoprotein A compound containing both lipid and protein; 
the main structural material of cell membranes. 

Liposome A spherical particle of lipid molecules in which 
the hydrophobic portions of the molecule are facing inward; 
a lipid vesicle with an aqueous interior that can carry nucleic 
acids, drugs, or other therapeutic agents. 

Liquefaction Enzymatic digestion (often by α-amylase) of 
gelatinized starch to form lower-molecular-weight 
polysaccharides. 


Liquid chromatography A technique used to separate mol¬ 
ecules that are dissolved in a solution. Separation occurs as 
the molecules in the solution (the mobile phase) pass through 
a column containing a second solution (the stationary phase) 
because the molecules interact with the second solution to 
different extents and therefore pass through the column at 
different rates. 

Live vaccine (1) A living, nonvirulent form of a microor¬ 
ganism or virus that is used to elicit an antibody response 
that protects the inoculated organism against infection by a 
virulent form of the microorganism or virus. (2) A living, 
nonvirulent microorganism or virus that expresses a foreign 
antigenic protein and is used to inoculate a human or animal. 
The latter type of organism is also called a live recombinant 
vaccine. 

Locus The site on the chromosome where a specific gene is 
located. 

Long template A DNA strand that is synthesized during the 
polymerase chain reaction, has a primer sequence at one end, 
and is extended beyond the site that is complementary to the 
second primer at the other end. 

Long terminal repeats Similar blocks of genetic information 
that are found at the ends of the genomes of retroviruses. 
Also called LTRs. 

LTRs See Long terminal repeats. 

Luciferase In fireflies, the enzyme, encoded by the luc gene, 
that catalyzes the oxidation of luciferin and subsequent 
emission of light in a bioluminescence reaction. In bacteria, the 
enzyme, encoded by luxA to luxE, that catalyzes the 
production of light. 

Luciferin In fireflies, the compound 
4,5-dihydro-2-(6-hydroxy-2-benzothiazolyl)-4-thiazole carboxylic acid, which is 
oxidized in the reaction used to produce light. 

Lupine A grain legume that is grown extensively in Australia 
and New Zealand. 

Lycopene A carotene photosynthetic pigment. Lycopene 
from tomatoes is thought to act as an antioxidant. 

Lymphoma Cancer originating in the lymph nodes, spleen, 
and other lymphoreticular sites. 

Lysis The destruction or breakage of cells by (1) viruses or (2) 
chemical or physical treatment. 

Lysogeny A condition in which a bacteriophage genome 
(prophage) survives within the host chromosome and lytic 
functions are repressed. 

Lytic cycle The process of viral production that usually leads 
to cell lysis. 

MAb See Monoclonal antibody. 

Macromolecule A molecule with a high molecular mass, 
such as a nucleic acid, protein, or polysaccharide. 

Macrophage A large phagocytic white blood cell. 

Macula Part of the retina of the vertebrate eye. 

Maltose-binding protein An abundant bacterial protein 
located within the periplasmic space and involved in the 
uptake of maltose. 

Marker An identifiable DNA sequence on a chromosome. 
Also called marker site, marker locus, genetic marker. 

Marker peptide A portion of a fusion protein that facilitates 
its identification or purification. 

Mass spectrometry Measurement of the mass-to-charge ratio 
of ions. 

Matrix-assisted laser desorption ionization (MALDI) A mass 
spectrometry technique often used to determine the masses of 
peptides. The peptides are mixed with a matrix consisting of an 
organic acid and then ionized with a laser. The ions are 
accelerated through a tube by using a high-voltage current, and 
the time required for them to reach the ion detector is used to 
determine their molecular mass; lower-mass ions reach the 
detector first. 
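
Why lower-mass ions arrive first follows from energy conservation. The sketch below assumes an idealized instrument in which every ion gains kinetic energy zeV and then drifts a fixed distance; the accelerating voltage and tube length are hypothetical values, and the time spent in the acceleration region is ignored.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
DALTON = 1.66053906660e-27            # kilograms per dalton

def flight_time(mass_da, charge=1, voltage=20000.0, tube_length=1.0):
    """Idealized drift time in seconds for an ion of the given mass.

    Assumes the kinetic energy (1/2)mv^2 equals charge * e * voltage and
    that the ion then travels tube_length meters at constant velocity.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * ELEMENTARY_CHARGE * voltage / mass_kg)
    return tube_length / velocity

# A 1,000-dalton peptide ion arrives before a 2,000-dalton one.
print(flight_time(1000.0) < flight_time(2000.0))
```

In this model the flight time scales with the square root of the mass-to-charge ratio, so doubling the mass lengthens the flight time by a factor of about 1.41 rather than 2.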

Mb See Megabase pair. 

MCS See Polylinker. 

Medaka The small freshwater teleost (bony fish) Oryzias 
latipes. 

Megabase pair 1,000,000 base pairs; a unit of length of DNA. 
Abbreviated Mb. 

Melittin A 26-amino-acid peptide with antimicrobial and 
hemolytic activities that is the major component of bee 
venom. 

Meristematic tissue Plant tissue that is actively dividing. In 
young plants, it is usually found at the tips of the stem and 
the roots. 

Merozoite A daughter cell of a protozoan parasite. It is the 
result of asexual reproduction. 

Mesophile A microorganism that is able to grow within the 
temperature range of 20 to 50°C; optimal growth often occurs 
at about 37°C. 

Messenger RNA An RNA molecule carrying the information 
that, during translation, specifies the amino acid sequence of 
a protein molecule. Also called mRNA. 

Metabolite (1) A low-molecular-weight biological compound 
that is usually synthesized by an enzyme. (2) A compound 
that is essential for a metabolic process. 

Metabolome The complete repertoire of metabolites of a cell, 
tissue, or organism. 

Metagenomics The study of the collective genomes in an 
environmental sample. 

Methylation The addition of a methyl group(s) to a macro¬ 
molecule; for example, the addition of a methyl group to 
specific cytosine and, occasionally, adenine residues in DNA. 

Michaelis constant See Km. 


Microchromosome See Human artificial chromosome. 

Microinjection The introduction of DNA or other com¬ 
pounds into a single eukaryotic cell with a fine microscopic 
needle. 

Microprojectile bombardment See Biolistics. 

Minitransposon A smaller version of a transposon con¬ 
taining only a portion of the DNA carried by the transposon. 

Mismatch The lack of base pairing between one or more 
nucleotides of two hybridized nucleic acid strands. 

Missense mutation A genetic mutation that changes a codon 
for one amino acid into a codon specifying another amino 
acid. 
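
A single-codon change can be classified by translating both codons. The sketch below uses the standard genetic code written with DNA letters; the function name and the labels for the other outcomes (silent, nonsense, stop-lost) are illustrative choices rather than terms defined in this entry.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code in TCAG codon order; "*" is a stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-lost"
    return "missense"

# GAG (glutamic acid) to GTG (valine) is a missense change.
print(classify_substitution("GAG", "GTG"))
```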

Mobilizing functions The genes on a plasmid that facilitate 
the transfer of either a nonconjugative or a conjugative 
plasmid from one bacterium to another. 

Modification (1) Enzymatic methylation of a restriction 
enzyme DNA recognition site. (2) A specific nucleotide 
change in a DNA or RNA molecule. 

Molecular mass The sum of the masses of all the atoms in a 
molecule, usually expressed in daltons or kilodaltons. 
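
As a worked illustration of the sum described above, the sketch below adds standard atomic masses for a given elemental composition. The small mass table, the dictionary representation of a formula, and the function name are assumptions made for this example.

```python
# Standard atomic masses in daltons for a few common elements.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}

def molecular_mass(composition):
    """Sum the atomic masses for a composition such as {"C": 6, "H": 12}."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())

glucose = {"C": 6, "H": 12, "O": 6}
mass_da = molecular_mass(glucose)
print(round(mass_da, 3), "daltons")
print(round(mass_da / 1000, 6), "kilodaltons")
```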

Monochromosomal Referring to the presence of a single 
human chromosome in a somatic cell hybrid line. 

Monoclonal antibody A single type of antibody that is 
directed against a specific epitope (antigenic determinant) 
and is produced by a hybridoma cell line, which is formed by 
the fusion of a lymphocyte with a myeloma cell. Some 
myeloma cells synthesize single antibodies naturally. Also 
called MAb. 

Monocotyledon A class of plants that have one seed leaf. 
Also called monocot. 

Monomer A unit of a polymer. 

mRNA See Messenger RNA. 

M13 + strand The single-stranded DNA molecule that is 
present in the infective bacteriophage M13. 

Mucosal immunity Protection from the pathogenic effects of 
foreign microorganisms or antigenic substances as a result of 
antibody secretions of the mucous membranes. Mucosal epithelia 
in the gastrointestinal, respiratory, and reproductive 
tracts produce a form of immunoglobulin A that protects 
these ports of entry into the body. 

Multigene RNA An RNA transcript of an operon. 

Multiple cloning site See Polylinker. 

Multiplex assay Simultaneous determination of a large 
number of different targets in one reaction vessel or by one 
analytical procedure. 

Multipoint linkage analysis The determination of the order 
and map distances of many loci on a chromosome at one 
time. Also called multilocus mapping, multipoint mapping, 
multilocus linkage analysis. 

Multivalent vaccine A single vaccine that is designed to 
elicit an immune response either to more than one infectious 
agent or to several different epitopes of a molecule. 

Must The juice from crushed grapes before fermentation; 
used in wine making. 

Mutagenesis Chemical or physical treatment that changes 
the nucleotides of the DNA of an organism. 

Mutant An organism that differs from the wild type because 
it carries one or more genetic changes in its DNA. Also called 
a variant. 

Mutant complementation See Genetic complementation. 

Mutation A change of one or more nucleotide pairs of a 
DNA molecule. 

Mutation detection assay A protocol that identifies the dif¬ 
ference of one or a few nucleotide pairs between the same 
DNA molecules from different sources. 

Mycelium A mass of interwoven threadlike filaments of a 
fungus or bacterium. 

Myeloma A plasma cell cancer. 

Mycotoxin A toxin produced by a fungus. 

N terminus The first amino acid(s) of a protein. Also called 
amino terminus, amino-terminal end. 

Narrow-host-range plasmid A plasmid that can replicate in 
one, or at most a few, different bacterial species. 

Native protein The naturally occurring form of a protein. 

Nebulization A process in which genomic DNA is frag¬ 
mented by forcing it through a small pore, thereby creating a 
fine spray. 

Negative control A system of regulation of transcription that 
requires the removal of a repressor from an operator. 

Neomycin phosphotransferase An enzyme that inactivates 
the antibiotics neomycin and kanamycin. This enzyme is 
often used as a selective marker for transgenic plants. 

Neonate A newborn. 

Neoschizomers Restriction enzymes that recognize and bind 
to the same nucleotide sequence in DNA and cut at different 
sites. 

Neutralizing antibody An antibody that reacts with an 
infectious agent (e.g., a virus) and destroys or inhibits its 
infectivity or virulence. 

Neutrophil A type of white blood cell that fights infection. A 
form of granulocyte, filled with neutrally staining granules, 
which are tiny sacs of enzymes that help the cell to kill and 
digest microorganisms. 

Nick (1) To break a phosphodiester bond in the backbone of 
one of the strands of a duplex DNA molecule. (2) A break in 
the backbone of one of the strands of duplex DNA. 

Nicotine A colorless, poisonous alkaloid found in tobacco 
that is sometimes used as an insecticide. 


Nitrilase An enzyme that catalyzes the hydrolysis of nitriles 
to carboxylic acids and ammonia. 

Nitrogen fixation The conversion of atmospheric nitrogen to 
ammonia. Biological nitrogen fixation is catalyzed by the 
enzyme nitrogenase, which is found only in prokaryotes. 

Nod box A DNA sequence that controls the transcriptional 
regulation of Rhizobium nodulation genes. 

Nodulation The formation of nodules by symbiotic bacteria 
on the roots of plants. 

Nonautologous From a different species or individual. 

Nonreducing end The end of a cellulose strand that cannot 
act as a reducing agent; it does not contain an aldehyde 
moiety. 

Nonvirulent agent See Attenuated vaccine. 

Northern blotting Similar to Southern blotting, except that 
RNA which has been separated by gel electrophoresis is 
transferred from a gel onto a matrix, such as a nitrocellulose 
or nylon membrane, and the presence of a specific RNA mol¬ 
ecule is detected by DNA-RNA hybridization. 

Nuclear cloning Production of an organism by placing a 
nucleus from a somatic cell into an enucleated fertilized egg. 
Also called nuclear transfer. 

Nucleocapsid The nucleic acid genome and surrounding 
protein coat of a virus. 

Nucleoside A base (purine or pyrimidine) that is covalently 
linked to a five-carbon (pentose) sugar. When the sugar is 
ribose, the nucleoside is a ribonucleoside; when it is 
deoxyribose, the nucleoside is a deoxyribonucleoside. 

Nucleotide A nucleoside with one or more phosphate groups 
linked to the 5' carbon of the pentose sugar. Ribose-containing 
nucleotides are called ribonucleoside monophosphate 
(NMP), ribonucleoside diphosphate (NDP), or ribonucleoside 
triphosphate (NTP). When the nucleotide contains the 
sugar deoxyribose, it is called a deoxyribonucleoside 
mono-, di-, or triphosphate (dNMP, dNDP, or dNTP). 

Nucleus The membrane-enclosed structure in eukaryotic 
cells that contains the genetic material. 

Off state A state in which a gene is not being transcribed. 

OLA See Oligonucleotide ligation assay. 

Oleosins Hydrophobic oil body proteins associated with 
plant seeds. 

Oligonucleotide A short molecule (usually 6 to 100 nucle¬ 
otides) of single-stranded DNA. Oligonucleotides are some¬ 
times called oligodeoxyribonucleotides or oligomers and are 
usually synthesized chemically. 

Oligonucleotide ligation assay A diagnostic technique for 
determining the presence or absence of a specific nucleotide 
pair within a target gene, which indicates whether a gene is 
wild type (normal) or mutant (defective). Abbreviated OLA. 

Oligonucleotide-directed mutagenesis See Site-specific 
mutagenesis. 

On state A state in which a gene is being transcribed. 

Oncogene A gene that plays a role in the cell division cycle. 
Often, mutated forms of oncogenes cause a cell to grow in an 
uncontrolled manner. 

Oncomouse A transgenic mouse that carries an activatable 
gene that makes it susceptible to tumor formation. 

Oocyte An egg produced by female ovaries. 

Open reading frame A sequence of nucleotides in a DNA 
molecule that encodes a peptide or protein. This term is often 
used when, after the sequence of a DNA fragment has been 
determined, the function of the encoded protein is not 
known. Abbreviated ORF. 
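
A minimal sketch of how candidate open reading frames might be located in a sequenced fragment is given below. It scans only the three forward reading frames and assumes that an ORF runs from an ATG codon to the first in-frame stop codon; the function name and the `min_codons` parameter are hypothetical, and real gene-finding tools apply many additional criteria.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ATG-to-stop stretches.

    Only the forward strand is scanned. Indices are 0-based, end is
    exclusive, and the codon count includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ATG
    return orfs

sequence = "CCATGAAATGGTAACC"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```

A complete search would also scan the reverse complement, since a gene can be encoded on either strand.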

Operator The region of DNA that is upstream from a 
prokaryotic gene(s) and to which a repressor or activator 
binds. 

Operon A cluster of genes that are coordinately regulated. 

Opine The condensation product of an amino acid with 
either a keto acid or a sugar. 

ORF See Open reading frame. 

ORFeome A large collection of cloned open reading frames 
of a proteome. 

ori See Origin. 

Origin The nucleotide sequence at which DNA synthesis is 
initiated. Also called origin of replication, ori. 

Origin of replication See Origin. 

Orthologues Sequences in different species that have a 
common origin. 

Osmolyte A compound that regulates the osmotic pressure 
within a cell. 

Outflow The volume of growing cells that is removed from 
a bioreactor during a continuous fermentation process. 

Overhang See Extension. 

Ovule The structure in seed plants that contains the female 
reproductive cells and develops into a seed after fertiliza¬ 
tion. 

Oxidative phosphorylation A series of enzyme-catalyzed 
reactions in which the acetyl moiety of acetyl coenzyme A is 
converted to carbon dioxide and water with the concomitant 
synthesis of ATP. 

Ozone A gaseous form of oxygen, O₃. Used for sterilizing 
water, purifying air, and bleaching. 

P site See Peptidyl site. 

PAC See P1 artificial chromosome. 

Packaging cell line A cell line that is designed to produce 
viral particles that do not contain infective nucleic acid. This 
process has been described as "putting a sheep in wolf's 
clothing." 

Palindromic sequences Complementary DNA sequences 
that are the same when each strand is read in the same direction 
(e.g., 5' to 3'). These types of sequences serve as recognition 
sites for type II restriction endonucleases. 
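
The definition above amounts to saying that a palindromic sequence equals its own reverse complement. A short sketch of that check follows; the function names are illustrative, and GAATTC is used as a familiar example of a type II recognition site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(seq):
    """True if both strands read the same in the 5'-to-3' direction."""
    return seq.upper() == reverse_complement(seq)

print(is_palindromic("GAATTC"))    # a type II recognition site
print(is_palindromic("GATTACA"))   # not palindromic
```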

Panicle A pyramidal, loosely branched flower cluster. 

Pantoea agglomerans A gram-negative bacterium typically 
found on plant surfaces. It is also an opportunistic human 
pathogen. 

Parallelization Performing a large number of reactions, such 
as sequencing reactions, simultaneously. 

Paralogue A sequence that arose by duplication within a spe¬ 
cies. 

Parasporal crystal Tightly packaged insect protoxin mole¬ 
cules that are produced by strains of Bacillus thuringiensis 
during the formation of resting spores. 

Partial digest Treatment of a DNA sample with a type II 
restriction endonuclease under conditions that result in a 
limited number of cuts in each DNA molecule to yield many 
possible combinations of cleaved pieces in the final sample. 
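
The difference between a complete and a partial digest can be illustrated by listing fragment lengths. The sketch below assumes a linear DNA molecule with known cut positions; the function names and the 10-kb example are hypothetical.

```python
from itertools import combinations

def complete_digest(length, cut_sites):
    """Fragment lengths when every site on a linear molecule is cut."""
    boundaries = sorted({0, length, *cut_sites})
    return [b - a for a, b in zip(boundaries, boundaries[1:])]

def partial_digest_lengths(length, cut_sites):
    """Distinct fragment lengths possible when only some sites are cut.

    Any two boundaries (a molecule end or a cut site) can delimit a
    fragment, because the sites between them may be left uncut.
    """
    boundaries = sorted({0, length, *cut_sites})
    return sorted({b - a for a, b in combinations(boundaries, 2)})

# A hypothetical 10-kb linear molecule with sites at 3 kb and 7 kb.
print(complete_digest(10, [3, 7]))
print(partial_digest_lengths(10, [3, 7]))
```

The partial digest includes the uncut molecule and the singly cut products in addition to the fragments of the complete digest, which is why it yields overlapping pieces.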

Passaging Subculturing cells that are growing in vitro. 

Passive immunity (1) Natural acquisition of antibodies by 
the fetus or newborn from the mother. (2) The artificial intro¬ 
duction of specific antibodies by the injection of serum from 
an immune animal. In both cases, it confers temporary pro¬ 
tection on the recipient. 

Patatin A storage protein commonly found in potatoes. 

Patent A government-issued document that allows the 
holder the exclusive right to manufacture, use, or sell an 
invention for a defined period, usually 20 years. 

Pathogenesis-related promoter The promoter for a plant 
gene whose transcription is activated upon infection of the 
plant by pathogens. 

PCR See Polymerase chain reaction. 

Pedigree A diagrammatic representation of the history of a 
trait in a multigeneration family. 

Peptide A short chain of amino acids that are linked with 
peptide bonds. 

Peptide bond The covalent bond between the α-carboxyl 
group of one amino acid and the α-amino group of an adjacent 
amino acid in a peptide or protein. 

Peptide vaccine A short chain of amino acids that can induce 
antibodies against a specific infectious agent. 

Peptidyl site The portion of a ribosome where the tRNA 
with the peptide chain participates in peptide bond forma¬ 
tion with the aminoacyl-tRNA during translation. Also called 
P site. 

Peptidyl-tRNA The tRNA that has a growing peptide chain 
attached to it during translation. 

Periplasm The space (periplasmic space) between the cell 
(cytoplasmic) membrane of a bacterium or fungus and the 
outer membrane or cell wall. 

Peritrophic membrane A semipermeable chitinous matrix 
lining the gut of most insects that is thought to be important 
in the maintenance of insect gut structure, facilitation of 
digestion, and protection from invasion by microorganisms 
and parasites. 

Permeate In cross-flow filtration, the solution that passes 
through the membrane. 

Peroxidase Enzymes that, together with hydrogen peroxide, 
catalyze the oxidation of certain organic compounds. 

Pesticide A chemical or biological agent that is used to kill 
pests. Pesticides are often applied to crops to control weeds 
and to reduce predation by insects or pathogenic microorgan¬ 
isms. 

Phage See Bacteriophage. 

Pharmaceutical agent See Therapeutic agent. 

Phase state The coupling or repulsion of two linked genes. 

Phenotype An observable feature or set of traits that is deter¬ 
mined by a gene or combination of genes of an organism. 

Phenylalanine ammonia lyase An enzyme that converts 
phenylalanine to cinnamic acid and tyrosine to p-coumaric 
acid. This enzyme is central to the synthesis of 
phenylpropanoids, precursors of a range of phenolic compounds, 
including lignin, in plants. 

Phenylketonuria An autosomal recessive disorder in humans 
that is due to the lack of the liver enzyme phenylalanine 
hydroxylase and that causes phenylalanine to accumulate. 
Abbreviated PKU. 

Phosphate An inorganic ion that contains a central phosphorus 
atom linked to four oxygen atoms (PO₄³⁻). 

Phosphodiester bond The linkage of a phosphate group to 
the 3' carbon of one nucleotide and the 5' carbon of another 
nucleotide; the linkage between nucleotides of the same 
nucleic acid strand. 

Phosphoramidite A chemically modified nucleoside used in 
the synthesis of short oligonucleotides. 

Phosphorothioate linkage The linkage between nucleotides 
after a sulfur group replaces an available oxygen of a phos¬ 
phodiester linkage. 

Phosphorylation The addition of a phosphate group to a 
molecule. 

Photolithography A manufacturing process that uses light. 
In the manufacture of some microarrays, the oligonucleotide 
probes are synthesized directly on a solid surface using mul¬ 
tiple rounds of addition of modified nucleotides followed by 
exposure to light to stimulate joining of the nucleotide to a 
growing oligonucleotide chain. Nucleotide addition is pre¬ 
vented in some positions by shielding them from the light. 

Photosynthetic Able to convert atmospheric carbon dioxide 
into organic compounds by using energy from sunlight. 
Nearly all plants, most algae, and some bacteria are 
photosynthetic. 

Phylogeny A prediction of evolutionary relationships among 
organisms that is determined from the comparison of molec¬ 
ular sequences and/or morphological characteristics. 

Physical map A map of the positions of chromosome sites, 
such as restriction endonuclease recognition or 
sequence-tagged sites, on a chromosome. The distance between sites is 
measured in base pairs. 

Phytase An enzyme able to break down phytate (phytic 
acid), a complex compound (inositol hexaphosphate) that is 
the major (60 to 80%) chemical form of phosphorus within 
cereal grains, oilseeds, and their byproducts. Many animals 
cannot digest and utilize the phosphorus within phytate, 
because they lack phytase in their digestive systems. The free 
inositol that is released upon digestion of phytate can chelate 
minerals such as iron, thereby facilitating its uptake. 

Phytoextraction The absorption and concentration of metals 
from the soil into the roots and shoots of plants. 

Phytohormone A substance that stimulates growth or other 
processes in plants; a plant hormone, e.g., auxin, cytokinin, 
gibberellin, ethylene, or abscisic acid. 

Phytopathogen An organism, such as a fungus, bacterium, 
or virus, that causes disease in plants. 

Phytoremediation The use of plants to remove or detoxify 
environmental contaminants, metals, or organic compounds. 

Phytostabilization The use of plants to reduce the spread of 
metals in the environment. 

Phytostimulation The stimulation of microbial biodegrada¬ 
tion of organic compounds in the rhizosphere, the area 
around the roots of plants. 

Phytotransformation The absorption and degradation of 
organic compounds by a plant. 

Phytovolatilization The uptake by a plant of compounds 
from the environment and subsequent release into the atmo¬ 
sphere of volatile materials, such as mercury- or arsenic-con¬ 
taining compounds. 

PKU See Phenylketonuria. 

Plaque A clear area that is visible in a bacterial lawn on an 
agar plate and is due to lysis of the bacterial cells by bacterio¬ 
phage. 

Plasmid An autonomous, self-replicating extrachromosomal 
DNA molecule. 

Plastid In plants, a double-membrane-bound organelle, such 
as a chloroplast. 

Pluripotent See Totipotent. 

Pneumatic reactor See Airlift fermenter. 

Pollen Microspores of plants that carry male gametes. 

Pollination The transfer of pollen to the female reproductive
organ of a plant, enabling fertilization to take place.

Poly(A) tail See Polyadenylation.

Polyadenylation The addition of adenine residues to the 3' 
ends of eukaryotic mRNAs. Also called poly(A) tailing. The 
adenine-rich 3'-terminal segment is called a poly(A) tail. 

Polyadenylation signal A sequence that terminates tran¬ 
scription and provides a recognition site at the end of an 
mRNA for the enzymatic addition of adenine residues. 

Polycistronic RNA An mRNA that encodes two or more 
proteins. 

Polyclonal antibody A serum sample that contains anti¬ 
bodies that bind to different antigenic determinants of one 
antigen. 

Polyhedron The combination of baculovirus nucleocapsids 
embedded in a matrix protein (polyhedrin). 

Polyhydroxyalkanoate Biodegradable polymers produced 
by microorganisms as a carbon and energy storage material. 

Polyketide A class of antibiotics. 

Polyketide synthase An enzyme involved in the biosyn¬ 
thesis of polyketide antibiotics. 

Polylinker A synthetic DNA sequence that contains a number 
of different restriction endonuclease sites. Also called mul¬ 
tiple cloning site (MCS). 

Polymer A macromolecule made up of a series of covalently 
linked monomers. 

Polymerase chain reaction A technique for amplifying a 
specific segment of DNA by using a thermostable DNA poly¬ 
merase, deoxyribonucleotides, and primer sequences in mul¬ 
tiple cycles of denaturation, renaturation, and DNA synthesis. 
Also called PCR. 
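
As a rough illustration of why multiple cycles are used: if every template were copied once per cycle, the number of target molecules would double with each cycle. The Python sketch below computes that idealized yield. The function name `pcr_copies` and the `efficiency` parameter (the fraction of templates copied per cycle) are assumptions introduced here for illustration and are not part of the definition above.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    Uses N = N0 * (1 + E)**n, where E is the assumed per-cycle efficiency
    (1.0 means every template is copied in every cycle).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting molecule, 30 ideal cycles: about a billion copies.
    print(pcr_copies(1, 30))
```

Real reactions level off as primers and nucleotides are used up, so this figure should be read as an upper bound rather than a prediction.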

Polymorphic probe An assay system that identifies a chro¬ 
mosome site with two or more allelic DNA sequences that 
each occur with a frequency of 1% (0.01) or greater in a large 
population. 

Polymorphic site A chromosome location that has two or 
more identifiable allelic DNA sequences that each occur with 
a frequency of 1% (0.01) or greater in a large population. Also 
called polymorphic locus. 
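
The 1% criterion can be checked directly from allele counts. The Python sketch below is a minimal illustration under the assumption that alleles at one site have been counted in a population sample; the function name `is_polymorphic` and the input format (a dictionary of allele counts) are hypothetical.

```python
def is_polymorphic(allele_counts: dict[str, int], threshold: float = 0.01) -> bool:
    """Return True if at least two alleles each reach the frequency threshold.

    allele_counts maps an allele label to the number of times it was observed.
    """
    total = sum(allele_counts.values())
    if total == 0:
        return False
    # Count the alleles whose frequency is at or above the threshold.
    frequent = [a for a, n in allele_counts.items() if n / total >= threshold]
    return len(frequent) >= 2


if __name__ == "__main__":
    print(is_polymorphic({"A": 980, "G": 20}))  # second allele at 2%
    print(is_polymorphic({"A": 995, "G": 5}))   # second allele at 0.5%
```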

Polymorphism Variation in phenotypic or genetic character¬ 
istics among individuals in a population. 

Polynucleotide A linear series of 20 or more nucleotides 
linked by phosphodiester bonds. 

Polypeptide A linear series of amino acids linked together 
with peptide bonds. Also called protein, protein chain. 

Poly(3-hydroxybutyric acid) The best studied and best char¬ 
acterized of the polyhydroxyalkanoates. 

P1 artificial chromosome A plasmid vector system based on
bacteriophage P1 that uses electroporation for introducing a
vector with a large DNA insert (100 to 300 kb) into Escherichia
coli. Also called PAC.

P1 cloning system A plasmid vector system based on bacteriophage
P1 that uses in vitro bacteriophage P1 packaging for
introducing a vector with a large DNA insert (80 to 100 kb)
into Escherichia coli.

Positional gene cloning A strategy for isolating an unknown 
disease gene. The disease gene is mapped to a chromosome 
site. A contig or genomic clone that covers the site of the dis¬ 
ease gene is tested for exons. Examination of exons and muta¬ 
tion detection assays establish which gene is the disease 
gene. 

Positive control A system of regulation of transcription that 
requires the addition of a protein activator to an activator site 
on the DNA. 

Positive selectable marker See Dominant marker selection.

Positive selection See Dominant marker selection.

Positive-negative selection A protocol that both selects for 
cells that carry a DNA insert integrated at a specific targeted 
chromosomal location (positive selection) and selects against 
cells that carry a DNA insert integrated at a nontargeted chro¬ 
mosomal site (negative selection). 

Posttranslational modification The specific addition of 
phosphate groups, sugars (glycosylation), or other molecules 
to a protein after it has been synthesized. 

PR protein A pathogenesis-related protein synthesized in 
some plants in response to stress. 

Preventive immunization Injection of an antigen to elicit an 
antibody response that will protect the organism against 
future infections. Also called vaccination. 

Pribnow box A sequence of six nucleotides (TATAAT) in the 
promoter region of a prokaryotic gene that is recognized and 
bound by the sigma factor of RNA polymerase to initiate 
transcription. 

Primary antibody The antibody that binds to the target mol¬ 
ecule in an ELISA or other immunological assay. 

Primary cell culture A population of growing cells that is 
started directly from a tissue or cells of an organism. 

Primary transcript Unprocessed RNA that is transcribed 
from a eukaryotic structural gene that has exons and 
introns. 

Primer A short oligonucleotide that hybridizes with a tem¬ 
plate strand and provides a 3' hydroxyl end for the initiation 
of nucleic acid synthesis. 

Primer walking A method for sequencing long (>1-kb)
cloned pieces of DNA. The initial sequencing reaction reveals 
the sequence of the first few hundred nucleotides of the 
cloned DNA. On the basis of these data, a primer that con¬ 
tains about 20 nucleotides and is complementary to a 
sequence near the end of the sequenced DNA is synthesized 
and used for sequencing of the next few hundred nucleotides 
of the cloned DNA. This procedure is repeated until the com¬ 
plete nucleotide sequence of the cloned DNA is determined. 

Prion An infectious agent consisting of an abnormal variant 
of a brain protein that induces the normal brain protein to 
misfold. Prions have been implicated in several neurodegenerative
diseases, including bovine spongiform encephalopathy
(mad cow disease) in cattle, scrapie in sheep, and
Creutzfeldt-Jakob disease in humans.

Probe (1) For diagnostic tests, the agent that is used to detect 
the presence of a molecule in a sample. (2) A DNA sequence 
that is used to detect the presence of a complementary 
sequence by hybridization with a nucleic acid sample. 

Probiotic A bacterium (or several bacteria) often used as a 
dietary supplement. Thought to improve the intestinal micro¬ 
bial balance and/or alleviate various intestinal ailments. 

Prodrug An inactive compound that is converted into a 
pharmacological agent by an in vivo metabolic process. 

Productivity The amount of product that is produced in a 
bioreactor within a given period of time. 

Progeny The offspring of a mating. 

Prokaryotes Organisms, usually bacteria, that have neither a 
membrane-bound nucleus enclosing their chromosomes nor 
functional organelles, such as mitochondria and chloroplasts.

Promoter A segment of DNA to which RNA polymerase 
attaches. It usually lies upstream of (5' to) a gene. A promoter 
sequence aligns the RNA polymerase so that transcription 
will initiate at a specific nucleotide. 

Pronucleus The nucleus of an egg or sperm after fertilization 
but before the egg and sperm have fused. 

Prophage A repressed or inactive state of a bacteriophage 
genome that is maintained in a bacterial host cell as part of 
the chromosomal DNA. 

Protease An enzyme that hydrolyzes peptide bond linkages 
and cleaves proteins into smaller peptides. Also called protei¬ 
nase, proteolytic enzyme. 

Protease inhibitor A protein that can form a tight complex 
with a protease and block its activity. 

Protein See Polypeptide. 

Protein chain See Polypeptide. 

Protein drug See Therapeutic agent. 

Protein microarray An array of a large number of different 
proteins for massively parallel analyses. 

Protein replacement therapy Treatment of an inherited dis¬ 
order with a structural protein that restores normal function. 

Proteolysis Enzymatic degradation of a protein. 

Proteome The complete repertoire of proteins of a cell, 
tissue, or organism. 

Proteomics The study of the structure, function, and interac¬ 
tions of the members of a proteome. 

Protocorm The precursor form of a corm, an underground 
plant propagule. 

Protoplast A bacterial, yeast, or plant cell that has had its cell 
wall removed either chemically or enzymatically. 

Protoxin A latent, nonactive precursor form of a toxin. 


Provirus A stage in the life cycle of a retrovirus in which the 
single-stranded RNA is converted into double-stranded 
DNA, which may then be integrated into the genome of a 
mammalian host cell. 

Pseudomonas A genus of common gram-negative bacteria 
that are widely distributed. Many of the soil forms produce a 
pigment that fluoresces under ultraviolet light, hence the 
descriptive term fluorescent pseudomonads. 

Pseudotype formation The packaging of the genome of one 
virus in the envelope or capsid protein of another virus. Also 
called phenotypic mixing. 

Psoriasis A chronic skin disease characterized by red patches 
covered with white scales. 

Psychrophile A microorganism that can grow at tempera¬ 
tures as low as 0 to 5°C. 

Purine Fusion of a pyrimidine and an imidazole ring, e.g., 
adenine or guanine. 

Pyrimidine A heterocyclic ring, e.g., thymine, cytosine, or 
uracil. 

Pyrogen A bacterial substance that causes fever in humans. 

Pyrophosphate Two covalently linked phosphate groups 
that are released during hydrolysis of a nucleoside triphos¬ 
phate to a nucleoside monophosphate. Pyrophosphate is 
released following the formation of a phosphodiester bond 
between nucleotides during DNA synthesis. 

Pyrosequencing A sequencing method that detects the 
release of pyrophosphate when a known nucleotide is added 
to a growing DNA strand in a template-dependent manner, 
that is, when DNA polymerase catalyzes the addition of the 
complementary nucleotide. 

Pythium ultimum A pathogenic soil fungus that causes root 
diseases in a variety of plants. 


Quencher The portion of a molecule that can quench fluores¬ 
cence. 

Query The input sequence for computer analysis, usually for 
similarity searches. 

Random amplified polymorphic DNA A diagnostic proce¬ 
dure in which chromosomal DNA (usually from plants but 
sometimes from microorganisms or animals) is characterized 
by the DNA fragments that are synthesized when PCR is 
initiated after the addition of a single primer to the reaction 
mixture. Abbreviated RAPD. 

Random mutagenesis A nondirected change of a nucleotide 
pair(s) in a DNA molecule. 

Random-primer method A protocol for labeling DNA in 
vitro. A sample of random oligonucleotides containing all 
possible combinations of nucleotide sequences is hybridized 
to a DNA probe. Then, in the presence of a DNA polymerase 
and the four deoxyribonucleotides (one of which is labeled),
the 3' hydroxyl ends of the hybridized oligonucleotides provide
initiation sites for DNA synthesis that uses the separated
strands of the probe DNA as a template. This reaction produces
labeled copies of portions of the probe DNA.

RAPD See Random amplified polymorphic DNA. 

Reading frame A series of codons that code for amino acids 
in a nucleotide sequence. 
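
Because codons are read three nucleotides at a time, one strand can be read in three different frames depending on where reading starts. The Python sketch below splits a sequence into codons for a chosen frame; the function name `codons_in_frame` and the 0/1/2 frame numbering are assumptions made for illustration.

```python
def codons_in_frame(seq: str, frame: int) -> list[str]:
    """Split a nucleotide sequence into complete codons for one forward frame.

    frame is the offset (0, 1, or 2) of the first nucleotide that is read.
    Trailing nucleotides that do not fill a whole codon are ignored.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1, or 2")
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


if __name__ == "__main__":
    for offset in range(3):
        print(offset, codons_in_frame("ATGGCCTAA", offset))
```

Only the frame that starts at offset 0 in this example begins with ATG and ends with a stop codon, which is why choosing the correct frame matters when a sequence is translated.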

Read-through Transcription or translation that proceeds 
beyond the normal stopping point because the transcription 
or translation termination signal of a gene is absent or 
mutated. 

RecA A protein, found in most bacteria, that is essential for 
DNA repair and DNA recombination. 

Recessive gene An allele that does not demonstrably con¬ 
tribute to the phenotype in a heterozygote. 

Recognition site See Restriction site. 

Recombinant An individual with two or more linked genes 
that are a consequence of one or more crossover events. 

Recombinant DNA technology See Gene cloning. 

Recombinant protein A protein whose amino acid sequence 
is encoded by a cloned gene. 

Recombinant toxin A single multifunctional toxic protein 
that has been created by combining the coding regions of 
various genes. 

Recombinant vaccine See Subunit vaccine. 

Recombination See Crossover. 

Reducing end The end of a cellulose strand that can act as a 
reducing agent; it typically contains an aldehyde moiety. 

Refugium A small tract of land where a crop that is otherwise
treated with the microbial insecticide Bacillus thuringiensis
is left untreated.

Regulatory protein A protein that plays a role in either 
turning on or turning off transcription. 

Remedial See Therapeutic. 

Renaturation The reassociation of two nucleic acid strands 
after denaturation. 

Replacement therapy The administration of metabolites, 
cofactors, or hormones that are deficient as the result of a 
genetic disease. 

Replica plating The transfer of cells from bacterial colonies 
on one petri plate to another petri plate; the locations of the 
colonies that grow on the second plate correspond to those on 
the original (master) petri plate. 

Replicatable biological unit Any biological entity that is 
capable of being reproduced. 

Replication The process of DNA synthesis. 

Replicative form The molecular configuration of viral 
nucleic acid that is the template for replication in a host cell. 
Also called RF.


Reporter gene A gene that encodes a product that can readily 
be assayed. For example, reporter genes are used to deter¬ 
mine whether a particular DNA construct has been success¬ 
fully introduced into a cell, organ, or tissue. 

Repression Inhibition of transcription by preventing RNA 
polymerase from binding to the transcription initiation site; a 
repressed gene is "turned off." 

Repressor A protein that binds to the operator or promoter 
region of a gene and prevents transcription by blocking the 
binding of RNA polymerase. 

Repulsion The phase state in which a dominant version and 
a recessive version of two different genes occur on the same 
chromosome. Also called trans configuration. See also 
Coupling. 

Response element A sequence of deoxyribonucleotides of a 
gene that acts as a binding site for a protein (transcription 
factor) that regulates transcription. Also called initiator ele¬ 
ment, signal region. 

Restenosis Recurring narrowing of a biological opening, 
tube, or canal. 

Restriction endonuclease (type II) An enzyme that recognizes
a specific duplex DNA sequence and cleaves phosphodiester
bonds on both strands between definite nucleotides.

Restriction map The linear array of restriction endonuclease 
sites on a DNA molecule. 

Restriction site The sequence of nucleotide pairs in duplex 
DNA that is recognized by a type II restriction endonuclease. 
Sometimes called restriction enzyme site, restriction endonu¬ 
clease site, or recognition site. 

Retentate The liquid retained after passages of a solution 
across an ultrafiltration membrane. 

Retrotransposon A genetic element that reproduces by first 
synthesizing an RNA intermediate, which is then copied back 
into DNA by reverse transcriptase before inserting randomly 
into a genome. 

Retrovirus A class of eukaryotic RNA viruses that can form 
double-stranded DNA copies of their genomes; the double- 
stranded forms can integrate into chromosomal sites of an 
infected cell. 

Reverse-phase microarray An array of multiprotein com¬ 
plexes of cell lysates or tissue specimens. 

Reverse transcriptase An RNA-dependent DNA polymerase 
that uses an RNA molecule as a template for the synthesis of 
a complementary DNA strand. 

Reverse transcription-polymerase chain reaction A two- 
step protocol for synthesizing cDNA molecules. First, cDNA 
strands are synthesized in vitro by reverse transcriptase with 
oligo(dT) as a primer and mRNA as the template. Second, a 
specific cDNA strand is amplified by the polymerase chain 
reaction (PCR), with one primer directed to a sequence of the 
first cDNA strand and the other to a sequence of the comple¬ 
mentary cDNA strand (second strand) that is synthesized 
during the first PCR cycle. Also called RT-PCR.

Reversible chain terminator A nucleotide that has a blocking
group at the 3' carbon of the deoxyribose sugar to prevent 
subsequent addition of nucleotides to a growing DNA strand. 
In DNA sequencing by single-nucleotide addition, the modi¬ 
fied nucleotides are used to ensure that the DNA strand is 
extended by only a single nucleotide during each cycle. After 
the incorporated nucleotide is identified by detection of a 
unique fluorophore, the blocking group is removed to restore 
the 3' hydroxyl group for the next cycle of nucleotide addi¬ 
tion. 

Rhizobacterium A microorganism whose natural habitat is 
near, on, or in plant roots. 

Rhizofiltration The use of plant roots to remove metals from
contaminated effluents. 

Rhizosecretion Secretion of molecules from the roots of 
plants. 

Rhizosphere The zone in the immediate vicinity of growing 
plant roots. 

Ribonuclease An enzyme that cleaves RNA. Also called 
RNase. 

Ribonucleic acid See RNA. 

Ribose The five-carbon sugar component of RNA. 

Ribosomal RNA The RNA molecules that form part of the 
large and small ribosomal subunits. Also called rRNA. 

Ribosome The subcellular structure that contains both RNA 
and protein molecules and mediates the translation of mRNA 
into protein. Ribosomes contain both large and small sub¬ 
units. 

Ribosome-binding site A sequence of nucleotides near the 5' 
phosphate end of a bacterial mRNA that facilitates the 
binding of the mRNA to the small ribosomal subunit. Also 
called Shine-Dalgarno sequence. 

Ribozyme An RNA molecule that has catalytic activity. 

Ribulose bisphosphate carboxylase The most abundant 
enzyme in the world, found in all green plants and respon¬ 
sible for the fixation of carbon dioxide in photosynthesis. 
Sometimes called RuBisCO. 

RNA Ribonucleic acid; a polynucleotide that has ribose as its 
pentose sugar and uracil as one of its pyrimidines. 

RNA interference A method to inhibit expression of a target 
gene. A small RNA binds to a complementary region of the 
mRNA of the target gene and prevents its translation into 
protein. 

RNA polymerase An enzyme that links an incoming ribo¬ 
nucleotide, which is determined by complementarity to a 
base in a template DNA strand, with a phosphodiester bond 
to the 3' hydroxyl group of the last incorporated ribonucle¬ 
otide of the growing RNA strand during transcription. 

RNase See Ribonuclease. 

Rolling circle A mode of DNA replication that produces 
concatemeric duplex DNA. 


Root nodule A small round mass of cells that is located on 
the roots of plants and contains nitrogen-fixing bacteria. 

rRNA See Ribosomal RNA. 

RT-PCR See Reverse transcription-polymerase chain reac¬ 
tion. 

Rumen A compartment of the stomach of cows and other 
ruminants where ingested food is initially digested. 

Saccharification Hydrolysis of polysaccharides, after liquefaction,
by glucoamylase to maltose and glucose.

Scaffolding Assembly of sequence contigs in the correct order 
and orientation to reconstruct the sequence of a genome. 

Scale-up Conversion of a process, such as fermentation of a 
microorganism, from a small scale to a larger scale. 

SCP See Single-cell protein. 

SDS See Sodium dodecyl sulfate. 

SDS-PAGE See Sodium dodecyl sulfate-polyacrylamide gel 
electrophoresis. 

Secondary antibody In an ELISA or other immunological 
assay system, the antibody that binds to the primary anti¬ 
body. The secondary antibody is often conjugated with an 
enzyme, such as alkaline phosphatase. 

Secondary metabolite A compound that is not necessary for 
growth or maintenance of cellular functions but is synthe¬ 
sized, generally for the protection of a cell or microorganism, 
during the stationary phase of the growth cycle. 

Secrete See Export. 

Secretion The passage of a molecule from the inside of a cell 
through a membrane into the periplasmic space or the extra¬ 
cellular medium. 

Secretion complex A complex of proteins in the cytoplasmic 
membrane of bacterial cells for transporting proteins across 
the membrane. 

Secretory proteins Proteins that are secreted by a cell.

Selectable Having a gene product that, when present, 
enables a researcher to identify and preferentially propagate 
a particular cell type. 

Selection (1) A system for either isolating or identifying spe¬ 
cific organisms in a mixed culture. (2) Survival of a more 
reproductively fit organism. 

Selective breeding The deliberate mating of plants or ani¬ 
mals with selected traits to develop offspring with desired 
characteristics. Also known as conventional breeding or tra¬ 
ditional breeding. 

Self-incompatibility In plants, the inability of the pollen to 
fertilize ovules (female gametes) of the same plant. Also 
called self-sterility. 

Self-replicating elements Extrachromosomal DNA elements 
that have origins of replication for the initiation of their own 
DNA synthesis.

Self-sterility See Self-incompatibility.

Senescence The last stage in the postembryonic develop¬ 
ment of multicellular organisms, during which loss of func¬ 
tions and degradation of biological components occur. Also 
called biological aging. 

Sensitivity The ratio of true-positive test results to all
samples that actually contain the target, i.e., true positives
plus false negatives.

Sequence-tagged site A short (200- to 500-bp) DNA sequence 
that occurs once in the genome and is identified by PCR 
amplification. Abbreviated STS. 

Sequence-tagged-site content mapping Determination of 
shared sites among clones of a library by using markers that 
are based on unique polymerase chain reaction primers. This 
facilitates the assembly of a contig. 

Serial analysis of gene expression A technique that identi¬ 
fies and quantifies short sequence tags to measure the expres¬ 
sion levels of all of the genes that are transcribed in a cell, 
tissue, or organism under a set of conditions. Also called 
SAGE. 

Serotype Classification of an organism or protein on the 
basis of its interaction with antibodies. 

Shear The sliding of one layer across another, with deforma¬ 
tion and fracturing in the direction parallel to the movement. 
This term usually refers to the forces that cells are subjected 
to in a bioreactor or a mechanical device used for cell 
breakage. 

Shine-Dalgarno sequence See Ribosome-binding site. 

Short template A DNA strand that is synthesized during the 
polymerase chain reaction and has a primer sequence at one 
end and a sequence complementary to the second primer at 
the other end. 

Shotgun cloning Construction of a library of small, overlap¬ 
ping fragments of genomic DNA to sequence the fragments. 
The overlapping sequences are then assembled to obtain the 
sequence of the entire genome. 

Shuttle vector A cloning vehicle, usually a plasmid,
that can replicate in two different organisms because it carries 
two different origins of replication. Also called bifunctional 
vector. 

Siderophore A low-molecular-weight substance that binds 
very tightly to iron. Siderophores are synthesized by a variety 
of soil microorganisms and plants to ensure that the organ¬ 
isms can obtain sufficient amounts of iron from the environ¬ 
ment. 

Sigma factor An accessory bacterial protein(s) that directs 
the binding of RNA polymerase to specific promoters. 

Signal peptide See Signal sequence. 

Signal recognition complex A group of proteins that binds 
to the signal peptide of a newly synthesized protein and tar¬ 
gets the protein for secretion across the cytoplasmic mem¬ 
brane through the secretion complex. 

Signal region See Response element. 


Signal sequence A segment of about 15 to 30 amino acids at 
the N terminus of a protein that enables the protein to be 
secreted (pass through a cell membrane). The signal sequence 
is removed as the protein is secreted. Also called signal pep¬ 
tide, leader peptide. 

Signal-to-noise ratio The ratio of the extent of the response 
to an assay when the target entity is present (signal) in a 
sample to the extent when it is absent (noise) from the 
sample. 

Silage Cattle feed that has been allowed to ferment.

Similarity Degree of relationship between two sequences.

Simplicity For diagnostic tests, the ease with which an assay 
can be implemented. 

Single-cell protein A dried mass of a pure sample of a pro¬ 
tein-rich microorganism, which may be used either as feed 
(for animals) or as food (for humans). Abbreviated SCP.

Single-site mutation A change in one base pair in DNA. 
Also called point mutation. 

Single-strand conformation analysis A mutation detection 
assay that is based on the conformation of single strands of 
DNA. If there is a nucleotide difference between the DNA 
molecules from two different sources, then following denaturation
and gel electrophoresis, the locations of the single
strands in the two lanes of the gel will be different. Also 
called single-strand conformational analysis, single-strand 
conformation(al) polymorphism (SSCP), SSCA. 

Site-specific mutagenesis A technique to change one or 
more specific nucleotides in a cloned gene in order to create 
an altered form of a protein with a specific amino acid 
change(s). Also called oligonucleotide-directed mutagenesis. 

Six-cutter A type II restriction endonuclease that binds and 
cleaves DNA at sites that contain six nucleotide pairs. 
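
Locating six-cutter sites in a known sequence is a simple string search. The Python sketch below reports every position at which a recognition sequence occurs; it uses GAATTC, the site recognized by EcoRI, as the default example. The function name `find_sites` and the 0-based position convention are assumptions made for illustration.

```python
def find_sites(dna: str, site: str = "GAATTC") -> list[int]:
    """Return the 0-based start positions of every occurrence of a site.

    Overlapping occurrences are reported. Only the given strand is scanned,
    which is sufficient for palindromic sites such as GAATTC.
    """
    dna = dna.upper()
    site = site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(find_sites("AAGAATTCTTGAATTCG"))
```

In a random sequence with equal base frequencies, a given six-nucleotide site is expected about once every 4^6 = 4,096 base pairs.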

Size markers A set of macromolecules with known molec¬ 
ular masses that are used to calculate the molecular masses of 
electrophoretically fractionated macromolecules. 

Small ribosomal subunit The smaller component of a ribo¬ 
some. 

Sodium dodecyl sulfate An anionic detergent that denatures 
proteins. 

Sodium dodecyl sulfate-polyacrylamide gel electropho¬ 
resis A technique for separating and then visualizing protein 
samples. Also called SDS-PAGE. 

Somatic cell Any cell of a multicellular organism that does 
not produce gametes. 

Somatic cell gene therapy The delivery of a gene(s) to a 
tissue other than reproductive cells of an individual with the 
aim of correcting a genetic defect. 

Somatic cell hybrid panel A set of derived chromosome- 
specific hybrid cell lines that each carry a different portion of 
a particular chromosome. The members of such a panel have 
chromosomal deletions and, in some cases, carry translocated 
chromosomes that retain a segment of a particular chromosome.
Ideally, the retained portions of the cell lines of a panel
cover the entire chromosome. Also called somatic cell hybrid
mapping panel, somatic cell deletion panel.

S1 nuclease An enzyme that specifically degrades single-stranded
DNA.

Sonication Disruption of cells or DNA molecules by high- 
frequency sound waves. Also called ultrasonication. 

Source DNA The DNA from an organism that contains a 
target gene; this DNA is used as starting material in a cloning 
experiment. 

Source organism An organism (e.g., a bacterium, plant, or 
animal) from which DNA is purified and used in a cloning 
experiment. 

Southern blotting A technique for transferring denatured 
DNA molecules that have been separated electrophoretically 
from a gel to a matrix (such as a nitrocellulose or nylon mem¬ 
brane) on which a hybridization assay can be performed. 

Sparger A device that introduces air into a bioreactor in the 
form of separate, fine streams. 

Specificity The ratio of true-negative test results to all
samples that actually lack the target, i.e., true negatives
plus false positives.
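
Specificity and its counterpart, sensitivity (see Sensitivity), are both simple ratios of test outcome counts. The Python sketch below computes each from the four possible outcomes of a diagnostic test; the function names and the sample counts are hypothetical and serve only as a worked example.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of samples that contain the target and test positive."""
    denominator = true_pos + false_neg
    if denominator == 0:
        raise ValueError("no samples containing the target were tested")
    return true_pos / denominator


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of samples that lack the target and test negative."""
    denominator = true_neg + false_pos
    if denominator == 0:
        raise ValueError("no samples lacking the target were tested")
    return true_neg / denominator


if __name__ == "__main__":
    # Hypothetical counts: 100 samples with the target, 100 without it.
    print(sensitivity(true_pos=90, false_neg=10))
    print(specificity(true_neg=95, false_pos=5))
```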

Splice site The nucleotides at (1) the end of an exon and the 
beginning of an intron and (2) the end of an intron and the 
beginning of the next exon that are required for the joining of 
two exons and the removal of an intron during the processing 
of a primary transcript into a functional mRNA. 

Splice site mutation Loss or gain of a functional splice site 
that affects the proper removal of introns during mRNA pro¬ 
cessing. 

Spore A small, protected reproductive form of a microor¬ 
ganism, often produced when nutrient levels are low. 

Sporozoite A cell of a malaria parasite that develops in the 
mosquito's salivary glands, leaves the mosquito during a 
blood meal, and enters the liver, where it multiplies. 

Sporulation Formation of spores or resting structures, usu¬ 
ally after the near depletion of nutrients from the growth 
medium, by some bacteria or fungi. 

SSCA See Single-strand conformation analysis. 

Staggered cuts Symmetrically cleaved phosphodiester bonds 
that lie on both strands of duplex DNA but are not opposite 
one another. 

Start codon See Initiation codon. 

Steady state In a continuous fermentation process, the condition
where the number of cells that are removed with the
outflow is exactly balanced by the number of newly divided
cells.

Stem cell A precursor cell that undergoes division and gives 
rise to lineages of differentiated cells. 

Stenosis Narrowing of a biological opening, tube, or canal.

Sticky ends See Cohesive ends.


Stirred-tank fermenter A growth vessel in which cells or 
microorganisms are mixed by mechanically driven impel¬ 
lers. 

Strain A microorganism or multicellular organism that is a 
genetic variant of a standard parental stock. 

Strand A linear series of nucleotides that are linked to each 
other by phosphodiester bonds. 

Streptavidin A protein from Streptomyces spp. that has a very 
high affinity for biotin and is often used for purification or 
detection of biotin-tagged molecules. 

Streptokinase A bacterial enzyme that catalyzes the conver¬ 
sion of plasminogen to plasmin, thereby helping to dissolve 
blood clots. 

Stress ethylene Ethylene that is synthesized in response to 
some form of environmental stress. 

Structural gene A sequence of DNA that encodes a protein.

STS See Sequence-tagged site.

Subcloning Splicing part of a cloned DNA molecule into a 
different cloning vector. 

Subcutaneous Lying beneath the dermis layer of the skin. 

Subspecies A population(s) of organisms sharing certain 
characteristics that are not present in other populations of the 
same species. 

Substantial equivalence A term used by many national 
regulatory agencies to describe genetically engineered prod¬ 
ucts that are similar in composition and safety to their non- 
genetically engineered counterparts. 

Substitutive therapy Treatment of an inherited disorder 
with a cofactor that restores enzyme function. 

Substrate (1) A compound that is altered by an enzyme. (2) 
A food source for growing cells or microorganisms. 

Substrate-induced gene expression A method used to identify
catabolic genes that are expressed when a particular
substrate is present. Also called SIGEX.

Subtilisin A proteolytic enzyme usually found in Bacillus 
subtilis. 

Subunit vaccine An immunogenic protein(s) either purified 
from the disease-causing organism or produced from a 
cloned gene. 

Sucrose density gradient centrifugation A procedure used 
to fractionate mRNAs or DNA fragments according to size. 

Suicide gene A plasmid-borne, inducible sequence that produces
a protein that directly or indirectly kills the host cell.

Superbug Jargon for the bacterial strain of Pseudomonas
developed by A. Chakrabarty, who combined hydrocarbon-degrading
genes carried on different plasmids into one
organism. Although this genetically engineered microorganism
is neither "super" nor a "bug," it is a landmark
example, because it showed how genetically modified microbial
strains could be used in a novel way and because it was
the basis for the precedent-setting legal decision that, in the
United States, genetically engineered organisms are patentable.

Suppressor tRNA An abnormal tRNA that inserts an amino
acid where a mutant mRNA specifies a stop codon in the
middle of the coding portion of a gene. Insertion of this
amino acid allows a normal-size rather than a shortened protein
to be synthesized.

Symbiosis A close biological relationship between two
organisms in which neither organism is extremely harmful to
the other. In some cases, the relationship is mutually beneficial.

Syndrome A constellation of features that together make up 
the symptoms of a disorder or disease. 

Systemic acquired resistance Resistance, in plants, to pathogenic
agents that occurs following an initial exposure to the
same or another pathogenic agent. This resistance extends to
plant tissues that are far from the site of the initial infection
and may last for weeks to months.

Systemic immunity Immunity that affects the body as a 
whole; all-body immunity. 

T A thymine residue in DNA. 

T cells Lymphocytes that pass through the thymus gland 
during maturation. Different kinds of T cells play important 
roles in the immune response. 

Tag See Label. 

Tailing The in vitro addition of the same nucleotide by the 
enzyme terminal transferase to the 3' hydroxyl ends of a 
duplex DNA molecule. Also called homopolymeric tailing. 

Tandem array Usually, a DNA molecule that contains two or 
more identical nucleotide sequences in series. 

Tandem mass spectrometry Initial mass analysis of ions
(precursor ions) followed by a second mass analysis of the
daughter ions of a selected precursor ion. Also called MS/MS,
tandem MS.

Target For diagnostic tests, the molecule or nucleic acid 
sequence that is being sought in a sample. 

Target gene A descriptive term for a gene that is to be either 
cloned or specifically mutated. 

Targeting vector A cloning vector carrying a DNA sequence 
capable of participating in a crossover event at a specified 
chromosomal location in the host cell. 

TATA box The DNA sequence to which RNA polymerase
binds and that lies upstream from the site of initiation of
transcription and ensures that transcription starts at a specified
nucleotide. Also called a Pribnow box in prokaryotes and
a Hogness box in eukaryotes, after the researchers who discovered
the function of the TATA box in prokaryotes and
eukaryotes, respectively.

T-DNA The segment of a Ti plasmid that is transferred and 
integrated into chromosomal sites in the nuclei of plant 
cells. 


Telomere The defined end of a chromosome containing specific
DNA sequences.

Temperature-sensitive protein A protein that is functional at 
one temperature but loses function at another (usually 
higher) temperature. 

Template strand The polynucleotide strand that a polymerase 
uses for determining the sequence of nucleotides during the 
synthesis of a new nucleic acid strand. 
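
As a minimal sketch of how a template strand fixes the sequence of the new strand, the Python function below builds the RNA complementary to a DNA template. The function name transcribe_template and the example sequence are illustrative assumptions, not taken from the text. The template is written 5' to 3' but is read 3' to 5', so it is reversed before pairing.

```python
# Base pairing used when an RNA polymerase copies a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5_to_3: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template.

    The template is supplied 5'->3'; the polymerase reads it 3'->5',
    so the sequence is reversed before each base is paired.
    """
    template = template_5_to_3.upper()
    return "".join(DNA_TO_RNA[base] for base in reversed(template))


if __name__ == "__main__":
    # Hypothetical six-base template, chosen only for illustration.
    print(transcribe_template("TACGGT"))
```

A base outside A, T, G, and C raises a KeyError, which is a deliberate choice here so that malformed input is not silently copied.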

Termination The cessation of the biosynthesis of a polymeric 
macromolecule. 

Termination codon A naturally occurring codon that does
not base pair with the anticodon of any tRNA. Generally, the
three codons in this class (UAA, UAG, and UGA) are used to
terminate translation, although in some rare instances one of
these codons does code for an amino acid. Also called translational
stop signal.
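
The definition above can be sketched in Python as a scan for the first in-frame termination codon in an mRNA. The function name first_stop_codon, the start parameter, and the example sequence are assumptions made for illustration; the sketch uses only the three usual stop codons and ignores the rare cases in which one of them codes for an amino acid.

```python
# The three codons that normally terminate translation.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_stop_codon(mrna: str, start: int = 0):
    """Return (index, codon) for the first in-frame stop codon, or None.

    The reading frame begins at position `start` (0-based) and advances
    three nucleotides at a time.
    """
    seq = mrna.upper()
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return i, codon
    return None


if __name__ == "__main__":
    # Hypothetical message: AUG GCU UAA GGG
    print(first_stop_codon("AUGGCUUAAGGG"))
```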

Termination factor A protein that enters the A site of a ribosome
when a stop codon in the mRNA is present and terminates
protein synthesis by stimulating cleavage of the
polypeptide from the tRNA in the P site. Also known as a
release factor.

Terminator A sequence of DNA at the 3' end of a gene that 
stops transcription. Also called transcription terminator. 

Tetanus An infectious disease marked by spasms of voluntary
muscles and caused by the toxin from the bacterium
Clostridium tetani.

T4 DNA ligase An enzyme from bacteriophage T4-infected
cells that catalyzes the joining of duplex DNA molecules and
repairs nicks in DNA molecules. The enzyme joins a 5' phosphate
group to a 3' hydroxyl group.

T4 DNA polymerase end labeling A process in which DNA 
that has been cut with a restriction endonuclease(s) is mixed 
with T4 DNA polymerase and one labeled deoxyribonucleotide.
The 3' exonuclease activity of the T4 DNA polymerase
removes deoxyribonucleotides from the 3' ends of the DNA 
fragments. Immediately after a deoxyribonucleotide that is 
the same as the deoxyribonucleotide in the reaction mixture 
is cleaved off, the T4 DNA polymerase activity incorporates a 
labeled deoxyribonucleotide from the reaction mixture. No 
further incorporation of deoxyribonucleotides occurs because 
there is only one kind of deoxyribonucleotide in the reaction 
mixture. 

Thaumatin A plant protein that has a sweet taste. It is also
synthesized in some plants in response to infection by pathogens.

Therapeutic Referring to treatment of a disease. 

Therapeutic agent A compound that is used for the treatment
of a disease and for improving the well-being of an
organism. Also called pharmaceutical agent, drug, protein
drug.

Thermophile A microorganism that grows optimally at high
temperatures, usually above 50°C. Some thermophiles can
grow at temperatures of 90 to 100°C.

Thermosensitivity Loss of activity of a protein at high temperature.

Thermostability Retention of activity at high temperature.

Thioredoxin A small protein that acts as an electron carrier.

3' extension A short, single-stranded nucleotide sequence on 
the 3' hydroxyl end of a double-stranded DNA molecule. 
Also called 3' protruding end, 3' sticky end, 3' overhang. 

3' hydroxyl end The hydroxyl group that is attached to the 3'
carbon atom of the sugar (ribose or deoxyribose) of the terminal
nucleotide of a nucleic acid molecule.

Thrombin A blood protein that plays a role in blood clotting.

Thrombus A blood clot.

Thymidylate synthase An enzyme that catalyzes the methylation
of the uracil moiety in dUMP to convert it to dTMP.

Thymine One of the organic bases found in DNA. 

Ti plasmid A large extrachromosomal element that is found 
in strains of Agrobacterium and is responsible for crown gall 
formation. 

Tissue plasminogen activator A protein involved in dissolving
blood clots. Abbreviated tPA.

Totipotent Generally, the state in which a cell is able to
respond to any one of a number of different stimuli and, subsequently,
to develop into any one of a number of differentiated
cell types. Also called pluripotent.

Toxoid A toxin that has been treated to destroy its toxicity 
but is left capable of inducing antibodies. 

tPA See Tissue plasminogen activator. 

Tracking dye A low-molecular-weight, visible, colored compound
that moves with the ion front during gel electrophoresis.

Transcribed triplet A set of three contiguous nucleotides of
the transcribed DNA strand of the coding region of a structural
gene that determines a codon in the mRNA.

Transcript An RNA molecule that has been synthesized from 
a specific DNA template. 

Transcription The process of RNA synthesis that is catalyzed 
by RNA polymerase; it uses a DNA strand as a template. 

Transcription factor A protein that facilitates RNA synthesis
by binding to a specific DNA sequence or another transcription
factor that is bound to a specific DNA sequence.

Transcription mapping Assigning gene transcripts, in the 
form of cDNA clones or expressed sequence tags, to specific 
chromosome regions by fluorescence in situ hybridization, 
hybridization, polymerase chain reaction, analysis of somatic 
cell hybrid mapping panels, or other strategies. Also called 
transcript mapping, transcriptional mapping. 

Transcriptome The complete repertoire of RNA molecules of 
a cell, tissue, or organism. 


Transduction The transfer of nonviral DNA by a virus to a 
cell. 

Transfection The transfer of DNA to a eukaryotic cell. 

Transfer RNA The RNA molecules that decode the sequence 
information contained in an mRNA molecule during the 
translation process. Also called tRNA. 

Transformation (1) The uptake and establishment of DNA in
a bacterium or yeast cell in which the introduced DNA often
changes the phenotype of the recipient organism. (2)
Conversion, by various means, of animal cells in tissue culture
from controlled to uncontrolled cell growth.

Transformation efficiency The number of cells that take up 
foreign DNA as a function of the amount of added DNA; 
expressed as transformants per microgram of added DNA. 

Transformation frequency The fraction of a cell population
that takes up foreign DNA; expressed as the number of transformed
cells divided by the total number of cells in a population.
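
Transformation efficiency and transformation frequency are easy to confuse, so a short Python sketch of the two calculations may help. The function names, argument names, and the numbers in the example are hypothetical; only the two ratios themselves come from the definitions above.

```python
def transformation_efficiency(transformants: float, dna_micrograms: float) -> float:
    """Transformants obtained per microgram of added DNA."""
    if dna_micrograms <= 0:
        raise ValueError("amount of added DNA must be positive")
    return transformants / dna_micrograms


def transformation_frequency(transformed_cells: float, total_cells: float) -> float:
    """Fraction of the cell population that took up foreign DNA."""
    if total_cells <= 0:
        raise ValueError("total number of cells must be positive")
    return transformed_cells / total_cells


if __name__ == "__main__":
    # Hypothetical experiment: 200 transformed colonies from 0.5 microgram
    # of DNA added to a population of one million cells.
    print(transformation_efficiency(200, 0.5))
    print(transformation_frequency(200, 1_000_000))
```

The efficiency is normalized to the amount of DNA and the frequency to the number of cells, so the same experiment yields two quite different numbers.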

Transgene A gene from one source that has been incorporated
into the genome of another organism. Often refers to a
gene that has been introduced into a multicellular
organism.

Transgenesis The introduction of a gene(s) into animal or 
plant cells that leads to the transmission of the input gene 
(transgene) to successive generations. 

Transgenic animal A fertile animal that carries an introduced 
gene(s) in its germ line. 

Transgenic plant A fertile plant that carries an introduced 
gene(s) in its germ line. 

Transient Of short duration. 

Translation The process of protein (polypeptide) synthesis 
in which the amino acid sequence of a protein is determined 
by mRNA mediated by tRNA molecules and carried out on 
ribosomes. 

Translational initiation signal See Initiation codon.

Translational start codon See Initiation codon.

Translational stop signal See Termination codon. 

Translocation (1) The movement of peptidyl-tRNA and 
mRNA from the aminoacyl site to the peptidyl site on the 
ribosome during the elongation phase of translation; this 
movement opens the aminoacyl site for the next codon. (2) 
The transfer of chromosome material from one chromosome 
to another. (3) The movement of compounds through a 
plant. 

Transposable element See Transposon. 

Transposase An enzyme that is encoded by a transposon
gene and that facilitates the insertion of the transposon into a
new chromosomal site and excision from a site.

Transposon A DNA sequence (mobile genetic element) that
can insert randomly into a chromosome, exit the site, and
relocate at another chromosomal site. For example, Tn5 is a
bacterial transposon that carries the genes for resistance to
the antibiotics neomycin and kanamycin and the genetic
information for its insertion and excision. Also called transposable
element.

Trichloroethylene An organic compound, used as a solvent 
and degreasing agent, that often persists in the environment. 

Tripartite mating A process in which conjugation is used to 
transfer a plasmid vector to a target cell when the plasmid 
vector is not self-mobilizable. When (1) cells that have a 
plasmid with conjugative and mobilizing functions are mixed 
with (2) cells that carry the plasmid vector and (3) target cells, 
mobilizing plasmids enter the cells with the plasmid vector 
and mobilize the plasmid vector to enter the target cells. 
Following tripartite mating, the target cells with the plasmid 
vector are separated from the other cell types in the mixture 
by various selection procedures. 

tRNA See Transfer RNA. 

True negative A test result that does not indicate the presence
of a target when it is not in a sample.

True positive A test result that always recognizes a target 
when it is present in a sample. 

Two-dimensional polyacrylamide gel electrophoresis A
technique to separate different proteins in a complex mixture
first based on differences in their net charges (first dimension)
and then on differences in their molecular weights
(second dimension).

Two-hybrid system An assay for identifying pairwise
protein-protein interactions.

2µm plasmid A naturally occurring, double-stranded, circular
DNA plasmid (6,318 bp) found in the nuclei of
Saccharomyces cerevisiae. Many yeast plasmid vectors are
derived from the 2µm plasmid. Also called 2µm circle, 2µ
plasmid, 2-micron plasmid.

U A uracil residue in RNA. 

Upstream (1) In molecular biology, the stretch of DNA base
pairs that lie in the 5' direction from the site of initiation of
transcription. Usually, the first transcribed base is designated
+1 and the upstream nucleotides are indicated with minus
signs, e.g., -1 and -10. Also, to the 5' side of a particular gene
or sequence of nucleotides. (2) In chemical engineering, those
phases of a manufacturing process that precede the biotransformation
step; the preparation of raw materials for a fermentation
process. Also called upstream processing.
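
The numbering convention in sense (1) can be illustrated with a small Python sketch that converts a position in a sequence into a coordinate relative to the transcription start site. The function name transcript_coordinate and its 0-based arguments are assumptions for illustration. The sketch also relies on the usual convention that there is no position 0, so that -1 lies immediately 5' of +1.

```python
def transcript_coordinate(index: int, start_site_index: int) -> int:
    """Convert a 0-based sequence index to a +1/-1 style coordinate.

    The first transcribed base (at start_site_index) is +1; bases on its
    5' side are numbered -1, -2, and so on. Position 0 is never used.
    """
    offset = index - start_site_index
    return offset + 1 if offset >= 0 else offset


if __name__ == "__main__":
    # Hypothetical start site at index 100 of a longer sequence.
    for i in (90, 99, 100, 101):
        print(i, transcript_coordinate(i, 100))
```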

Upstream processing See Upstream. 

Uracil One of the organic bases found in RNA. 

Vaccination See Preventive immunization. 


Variable domains Regions of antibody chains that have different
amino acid sequences in different antibody molecules.
These regions are responsible for the antigen-binding specificity
of the antibody molecule.

Variant An organism that is genetically different from the 
wild-type organism. Also called mutant. 

Vascularization The formation of blood vessels. 

Vector See Cloning vector. 

Vegetative Referring to the normal growth cycle of a microorganism.

Vehicle See Cloning vector. 

vir genes A set of genes on a Ti plasmid that prepare the 
T-DNA segment for transfer into a plant cell. 

Virion An infectious virus particle. 

Viroid A circular single-stranded RNA that forms a highly
base-paired (to itself) double-stranded-like structure and acts
as a disease-causing agent. Viroids do not encode any proteins.

Virulence The degree of pathogenicity of an organism. 

Vmax The maximal rate of an enzyme-catalyzed reaction. Vmax
is the product of E0 (the total amount of enzyme) and the
value of kcat (the catalytic rate constant).
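
The relationship stated above is a single multiplication, sketched here in Python. The function name v_max, the units, and the numerical values are hypothetical and chosen only to show how the units combine (per second times molar gives molar per second).

```python
def v_max(k_cat: float, e_total: float) -> float:
    """Maximal reaction rate as the product of kcat and total enzyme E0.

    k_cat   -- catalytic rate constant (for example, per second)
    e_total -- total enzyme concentration E0 (for example, molar)
    """
    if k_cat < 0 or e_total < 0:
        raise ValueError("k_cat and e_total must be non-negative")
    return k_cat * e_total


if __name__ == "__main__":
    # Hypothetical enzyme: kcat = 100 per second, E0 = 2 micromolar.
    print(v_max(100.0, 2e-6))  # rate in molar per second
```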

Washout The loss of the slower-growing microorganism 
when two organisms are being grown together. 

Western blotting Transfer of protein from a gel to a membrane.

Whey A liquid by-product of cheese making, containing 
mostly lactose and some milk proteins and minerals. 

Wild type A genetic term that denotes the most commonly 
observed phenotype, or the normal state, in contrast to a 
mutant condition. 

X linkage The presence of a gene on the X chromosome. 

X-ray diffraction A technique to determine the three-dimensional
structure of a molecule based on the diffraction pattern
of X rays by the atoms of the molecule.

Xenobiotic A chemical compound that is not produced by 
living organisms; a manufactured chemical compound. 

Xenogeneic From a different species or individual (an attribute
of cells or tissue). Also called xenogenic.

Xenomouse A transgenic mouse that has been engineered to 
produce a full range of human antibodies against every 
antigen. 

Xenotransplantation A procedure for transferring cells, tissues,
or organs from one species to another species.

Xylem A tissue that transports water and dissolved minerals
in plants. Xylem contributes significantly to the mechanical
strength of the plant.

Xylose A five-carbon sugar that is a major constituent of
hemicellulose.

YAC See Yeast artificial chromosome. 

Yeast A single-celled fungus. 

Yeast artificial chromosome A yeast-based vector system for 
cloning large (>100-kb) DNA inserts. Abbreviated YAC. 


Yeast episomal vector A cloning vector for the yeast
Saccharomyces cerevisiae that uses the 2µm plasmid origin of
replication and is maintained as an extrachromosomal nuclear
DNA molecule.

Zinc finger proteins Sequence-specific DNA-binding proteins
that contain domains that bind Zn2+.
A

aadA gene, in marker gene removal, 754 
ABO blood group antigens, glycosidase 
alterations in, 394-395 
Abscisic acid-inducible promoter, in salt-resistant
plant production, 793
Acceleration phase, of microbial growth, 
687-689 

Acetaldehyde, as biodegradation product, 
552 

Acetate 

in actinorhodine production, 528 
in antibiotic production, 529-532 
as fermentation inhibitor, 694, 698-701 
Acetoin, synthesis of, 699-700 
2-Acetolactate, in valine production, 518-519

Acetolactate synthase, in fermentation, 699 
Acetosyringone, in crown gall disease, 727 
Acetyl-coenzyme A 

as biodegradation product, 552 
in polyhydroxyalkanoate production, 
542-545, 830 

Acetylene, ethylene formation from, 621 
D-N-Acetylglucosamine, in hyaluronic acid 
production, 545-547 

Achromobacter piechaudii, in flooding protection,
605

AcMNPV, see Autographa californica multiple
nuclear polyhedrosis virus
Acremonium chrysogenum 

in antibiotic production, 533-534 
in fermentation, 694 
Actin, DNase I binding to, 390 
Actinorhodine, production of, 528-529 
Activation domain, in protein-protein 
interaction, 182-183 
Adaptors 

in cyclic array sequencing, 137 
in DNA cloning, 105 
in SAGE method, 161 
Adenine (A) 
derivatization of, 99, 103 
in DNA structure, 15-17 


in RNA structure, 22 

Adeno-associated virus, 277-278, 447-448 
Adenosine deaminase deficiency, gene therapy
for, 446

S-Adenosylmethionine, 1-aminocyclopropane-carboxylate
deaminase synthesis from, 605

S-Adenosylmethionine synthase, in ethylene
regulation, 798-799
Adenovirus, as vector, 273, 491 
Adenylate kinase, modification of, 820-821 
Adhesion proteins, in dental caries vaccine,
479-480

Adhesive proteins, commercial production 
of, 539-541 

Adjuvants, for vaccines, 476-477
ADP-glucose pyrophosphorylase, in starch 
synthesis, 818-819 

Aequorea victoria, fluorescent protein of, 341 
Aerosol immunization, 491 
Affinity tags, 246 

African green monkey kidney (COS) cells, 
272, 275 

AGp-Gal protein, in fermentation, 708 
Agitation 

in bioreactors, 701-705 
in fermentation, 693 
Agriculture 

biodiversity in, 932-933 
economic issues in, 8 
lignocellulosics from, 581-583 
Agrobacterium 

in filamentous fungal systems, 260 
in iron production, 834 
Agrobacterium radiobacter, antibiotics of, 614 
Agrobacterium tumefaciens 
antibiotics of, 614 
in starch production, 818 
Ti plasmid of 

infection with, 726-730 
for insect resistance, 764-766 
vector systems derived from, 730-735 
Agrocins, for phytopathogen control, 612-614


Agropine, in crown gall disease, 729-730 
Airlift bioreactors, 702, 705-708 
Alanine, 21 

Albumin gene, in lupine, 803-804 
Albumin-growth hormone hybrid molecule,
387-388

Albumin-interferon hybrid molecule, 383 
Albutropin, 387-388 
Alcaligenes eutrophus, in polyhydroxyalkanoate
production, 542-545, 830
Alcaligenes latus, in polyhydroxyalkanoate 
production, 542-545 
Alcohol 

in low-ethanol wines, 574-575 
production of 

lignin modification for, 835-836 
Saccharomycopsis fibuligera in, 588 
in starch degradation, 570-576, 581 
yeast tolerance in, 578-580 
Zymomonas mobilis in, 589-595 
Alcohol oxidase, in Pichia pastoris system,
255

Alfalfa, genetically modified 
lignin modification in, 835-836 
regulation of, 907, 909 
Alginate encapsulation, for phytoremediation,
643

Alginate lyase, production of, 390-392 
Alkali treatment, for microbial cell disruption,
714-715
Alkaline phosphatase 
in cloning, 58 

for DNA hybridization, 350-352 
secretion of, 232 
Allele(s) 

for ancestry determination, 361-362 
of CFTR gene, 366-367 
Allele-specific oligonucleotide dot blots, for 
cystic fibrosis, 366-367 
Allergens, in genetically modified foods, 
927-930 

"All-fish" construct, in transgenesis, 888 
Alphaherpesvirus, mouse model for, 866 
Alternate splicing, 24-25
Alzheimer disease

protein microarray analysis for, 174-175 
transgenic mouse model for, 863-865 
Amber suppressor tRNA, 304 
American trypanosomiasis, diagnosis of, 
349-350 
Amino acid(s) 
codons for, 30-32 

commercial production of, 514-519 
directed mutations of, see Directed mutagenesis

in plants, modification of, 805-808 
posttranslational modification of, 241-242

for protein stabilization, 215-216 
as protein subunits, 21 
translational errors with, 234 
unusual, for directed mutagenesis, 304-305

D-Amino acid oxidase, in marker removal, 
753 

p-Aminobenzoic acid, in folate synthesis, 
813 

7-Aminocephalosporanic acid, 533-534 
1-Aminocyclopropane-carboxylate deaminase

ethylene and, 640, 735, 798-799 
in plant growth, 601-603, 605, 618 
7-Aminodeacetoxycephalosporanic acid,
533-534 

Aminopterin, for monoclonal antibody 
preparation, 338-339 
Ammonia 

as fermentation inhibitor, 694 
nitrogen conversion to, 619, 621 
Ampicillin, in plasmid transformation, 62, 
65-66 

Ampicillin resistance gene, 201 

in baculovirus-insect cell expression system,
266

in directed mutagenesis, 295 
in ethylbenzoate degradation, 562 
in superoxide dismutase production, 250

in translation, 214 
α-Amylase

heat resistant, 290, 571-572 
inhibitors of, for insect resistance, 766 
production of, 257-258 
secretion of, 232 

in starch hydrolysis, 569-573, 820 
β-Amylase, in starch degradation, 570
Amyloid bodies, in Alzheimer disease, 863-865

Amylopectin 

modification of, 818-821 
in starch degradation, 570-571 
structure of, 569 
Amyloplasts, 741 
Amylose 

modification of, 818-821 
in starch degradation, 570-571 
structure of, 569 


Analytical (capture) protein microarrays, 
174-176 

Ancestry determination, DNA testing for, 
361-364 

Anchor primers, for ligation sequencing, 
132 

Anchoring enzymes, in SAGE method, 161 
Animal(s) 
feed for 

amino acid-enriched, 803-805 
phosphorus-enriched, 815 
phytase addition to, 885 
StarLink corn in, 929 
species determination of, 364 
transgenic, see also Mice, transgenic 
applications of, 845-846 
fish, 886, 888-890 
livestock, 871-885, 910-911, 925 
patents for, 917-918 
poultry, 885-887 
summary, 890 
terminology of, 846 
Annealing, of DNA fragments, 56 
Anthocyanin, in flower pigmentation, 822-824

Anthranilate synthetase, in amino acid production,
515-516
Antibiosis, 599 

Antibiotic(s), see also specific antibiotics 
commercial production of, 521-535 
designer, 534-535 
genes of 

cloning of, 523-526 
modulation of, 526-527 
improving production of, 531, 533-534 
market statistics for, 522 
novel, 522-523, 527-529 
for phytopathogen control, 612-614 
polyketide, 529-532 
resistance to, see Antibiotic resistance 
Antibiotic resistance 
biofilms in, 221-222 
from foods, 930-931 
in Helicobacter pylori, 496 
in marker gene removal, 754-755
marker genes for, 227-228, 478-479
plasmids for, 58 
Antibodies 
chimeric, 403-406

complementarity-determining regions of, 
400-401, 411, 416-417 
dual-variable-domain, 420 
in ELISA, 335-336 
fragments of, 408-415
full-length, libraries of, 415-416
functions of, 400-402 
fusion protein, 207-208 
genes of, 443-444

for genomic library screening, 77-78 
humanized, 403-406, 862-863
modifying specificity of, 320-321 
monoclonal, see Monoclonal antibodies 
for nucleic acid delivery, 454-455


polyclonal, 337 
primary, 339 
production of 
in plants, 827-829 
in transgenic mice, 862-863 
in protein microarray analysis, 174-176 
secondary, 339 

single-chain, for virus resistance, 781-782 
structures of, 400-402 
in transgenic animals, 876-879 
Antibody-dependent cell-mediated cytotoxicity,
402

Antibody-dependent cellular inhibition, in 
malaria, 472 
Anticodons, 28-29 

Antifoaming agents, for bioreactors, 703-704

Antifreeze proteins, 614, 616-617, 900-902 
Antifungal metabolites, of Pseudomonas, 612 
Antigen(s) 

fusion protein as, 207-208 
plants producing, 828, 830-832 
in protein microarray analysis, 175-176 
Antiparallel chains, DNA, 17 
Antisense DNA, for vegetable discoloration 
control, 816 

Antisense oligonucleotide(s), therapeutic,
428-434

Antisense RNA, 426-434

in fruit ripening suppression, 797 
in viral resistance, 774-779
Antithrombin, from milk of transgenic animals,
875, 911

α1-Antitrypsin, production of, 393-394
AOX1 gene, in Pichia pastoris system, 255 
Apolipoprotein B 

elevated, antisense oligonucleotides for,
431-432

nucleic acid delivery to, 451-452 
production of, 711 
Apoptosis, inhibitors of, 279 
Apotyrosinase, in melanin production, 539 
APP gene mutations, in Alzheimer disease, 
863-865 

Apple, phytopathogen resistance in, 789 
Aptamers, 437-440, 455-456
Apyrase, in pyrosequencing, 128 
Aquaculture 

antibody gene manipulation for, 443-444
transgenic fish for, 886, 888-890 
Arabidopsis thaliana 

in L-cysteine production, 517 
insect toxicity of, 768 
as laboratory model, 769 
metal uptake in, 839-840 
in omega-3 fatty acid production, 808 
phytopathogen resistance in, 789 
in polyhydroxyalkanoate production, 830 
salt tolerant, 795 
in vitamin E synthesis, 809-811 
Arabinogalactans, 583 
Arabinose, in alcohol production,
Zymomonas mobilis in, 593-594
Arginyl-tRNA synthetase, 26
Artificial chromosomes 

bacterial, in library creation, 68-69 
human, 449-450

yeast, see Yeast artificial chromosomes 
Arxula adeninivorans, expression systems 
for, 255-259 

Aryloxyphenoxypropionates, plant resistance
to, 784

Ascorbic acid, commercial production of, 
507-512 

Asparagine, in proteins, changing to other 
amino acids, 310-311 

Aspartate β-semialdehyde dehydrogenase,
in DNA vaccine production, 477 
Aspartic acid, in seed amino acid modification,
805

Aspen, lignin modification in, 834-835
Aspergillus 

expression systems for, 259-261 
infestations with, economic impact of, 
936 

Aspergillus awamori, glucoamylase of, 572 
Aspergillus fumigatus, in phosphorus 
uptake, 607-608 

Aspergillus giganteus, in phytopathogenic 
fusion protein production, 790-791 
Aspergillus niger, in fructose production, 571 
Assembler program, for DNA sequencing, 
134 

Astaxanthin, in flower pigmentation, 824 
Asticcacaulis excentricus, in Bacillus thuringiensis
toxin production, 667-668
Atelocollagen particles, for nucleic acid 
delivery, 453 

Atlantic salmon, transgenic, 888 
atoC gene product, in polyhydroxyalkanoate
production, 544

ATP sulfurylase, in pyrosequencing, 128 
Attachment sequence, of bacteriophage
genome, in recombinational cloning,
178-180

Attenuated vaccine(s), 480-486
cholera, 481-482 
herpes simplex virus, 485-486 
Leishmania, 484-485 
Salmonella, 482-484 
Auristatin E, 421-422
Autographa californica multiple nuclear 
polyhedrosis virus (AcMNPV) 
in baculovirus production, 680-681 
insect cell expression systems, 261-271 
mammalian cell expression, 275-278 
Automation 

of DNA analysis, 364-365 
of DNA synthesis, 98-99 
of PCR, 110 

Autonomous replicating sequence, 245 
Autoradiography 

in dideoxynucleotide procedure, 120-121 
in DNA hybridization, 73-76 
Autosomal DNA, 361 
Autotransporter pathway, 43 


Auxins, 601, 728-730 

Avenic acid, in siderophores, 834 

Avidin 

for DNA hybridization, 350-352 
insect toxicity of, 768 

Azotobacter, in polyhydroxyalkanoate production,
545

Azotobacter chroococcum, as fertilizer, 600 
Azospirillum, in mineral uptake, 603 
Azuki bean weevil, 766 

B 

B cells, monoclonal antibodies derived 
from, 337-338 

B ions, in protein separation, 169 
Baby hamster kidney cells, 272 
BACE1 protein, transgenic mouse model 
for, 863-865 

Bacillus amyloliquefaciens 
in starch degradation, 570, 572 
subtilisin of, 316 

Bacillus brevis, in fructose production, 576 
Bacillus Calmette-Guerin vaccine, 493-494
Bacillus circulans, xylanase of, 308-309
Bacillus licheniformis, in herbicide-resistant 
plant production, 785-786 
Bacillus megaterium, as fertilizer, 600 
Bacillus mucilaginosus, in phosphorus 
uptake, 607-608 

Bacillus sphaericus, in Bacillus thuringiensis 
toxin production, 667-668, 676 
Bacillus stearothermophilus 
heat-resistant amylase in, 290 
in starch degradation, 572 
tyrosyl-tRNA synthetase of, 312-314 
Bacillus subtilis 

in alginate lyase production, 392 
in antibiotic production, 534 
DNA integration into, 224-227 
in fermentation, 699 
in hyaluronic acid production, 546-547 
subtilisins of, modifying multiple properties
of, 215
Bacillus thuringiensis 

Cry proteins of, see Cry proteins and cry 
genes, of Bacillus thuringiensis 
protoxin of, 660-661, 670-671, 760-764 
sporulation of, 661-663 
subspecies of, 663-664 
Bacillus thuringiensis toxin, 653-677 
classes of, 654, 656 
cry gene transferal in, 664-665 
discovery of, 657-658 
genes of, 658-677 

improvement of biocontrol, 674-677 
insect resistance to, prevention of, 770-773

mode of action of, 653-658 
mosquitocidal, 666-668 
nontarget insects and, 663-666 
plant root protection from, 668-670 
protease inhibitor with, 765-766 
protoxin of, 660-661, 670-671, 760-764 


resistance to, 671-674 
safety of, 927-929, 933-934 
from subspecies, 653-654 
synthesis of, 660-663 
target insects for, 663-666 
use of, 653-658 
Bacmids, 265-268 
Bacteria, see also specific bacteria 

as antigen delivery systems, 494-497
biofilm formation by, 221-222 
cell disrupting procedures for, 714-716 
cloning vectors for, 57-68 
DNA transfer into, 93-94 
endophytic, 644-645 
as fertilizers, 600, 619 
genetically modified, release into environment,
900-902
harvesting of, 711-714 
hemoglobin of, 220-221 
metagenomics of, 148-154 
nonculturable, metagenomic study of, 
148 

for nucleic acid delivery, 452-453
plant growth-promoting, 599-651 
plant resistance to, 787-793 
protein secretion pathways of, 40-43
replication in, 20 

restriction endonucleases of, 51-52 
transcription in, 33-37 
vaccines for, 492-494

Bacterial artificial chromosomes, in library 
creation, 68-69 
Bacteriophage λ

in antibody production, 414-415
in baculovirus-insect cell expression system,
267

as cloning vector, 86-90 
cosmids of, 90-91 
FokI cleavage of, 320 
pR promoter of, 201
in recombinational cloning, 178-181 
structure of, 87-88 
Bacteriophage M13
in antibody production, 414-415
oligonucleotide-directed mutagenesis 
with, 292-295 
protein III gene of, 211 
Bacteriophage P1, 90, 856-858
Bacteriophage T4 
DNA ligase of 

in directed mutagenesis, 292-295, 297-298,
301-303
in fermentation, 706-708 
in gene expression, 201 
lysozyme of, 305-306, 308, 791-792 
polynucleotide kinase of, 134 
RNA polymerase of, 304 
Bacteriophage T7, gene 10 promoter of, 198 
Baculovirus(es) 
for insect control, 677-681 
genetic engineering for, 679-681 
mode of action of, 677-679 
linearized genome of, 264-265 
in mammalian cell expression, 275-278 
pathophysiology of, 261-262 
structure of, 678 

Baculovirus-insect cell expression systems, 
261-271 

mammalian glycosylation in, 267, 269- 
270 

multiprotein complexes produced in, 
270-271 

promoters in, 262-263 
proteins produced in, 262 
target gene integration in, 265-268 
vectors for, 263-264 
yield improvement in, 264-265 
"Bait," in protein-protein interaction, 182- 
186 

BamHI restriction endonuclease, 52, 54-55, 
59 

bacteriophage λ, 88-89 
in DNA cloning, 105 
Banana, edible vaccine antigens in, 831 
Barley α-amylase signal peptide coding 
sequence, 791 

Barley stripe mosaic virus, 780 
Base(s), in nucleotide structure, 15-17 
Base caller computer program, for DNA 
sequencing, 134 

Batch fermentation, 687-689, 708-711 
Beacons, molecular, 352-353 
Bean, see also Soybean 

nitrogen fixation in, 619-620, 629-630 
Beta-carotene ketolase, in astaxanthin production, 824-825 

Beta-carotene, plants enriched with, 811-812, 925-926 

Betaine, for osmoprotection, 794 
Bicistronic vectors, 274 
Binary cloning vectors, 731-733, 776-777 
Biodegradable plastics (polyhydroxyalkanoates), commercial production 
of, 542-545, 711 
Biodegradation 
of cellulose, 580-595 
pathways for, 556-569 
phytoremediation in, 641-647 
plasmids for, 558-559 
of starch, 569-580 
of xenobiotics, 551-556 
Biodiversity, 932-933 
Biofilm formation, 221-222, 391 
Biofluorescent and bioluminescent systems, 
for diagnosis, 341-345 
Bioinformatics 
definition of, 147 

for DNA microarray analysis, 156-157 
metagenomics, 148-154 
molecular databases, 146-148 
for protein-protein interaction, 185, 187 
summary, 189 
Biolistic delivery system 
for gene transfer, 736-738 
for vaccines, 473 


Biomass 

concentration of, in fermentation, 687- 
689 

utilization of, see Bioremediation and biomass utilization 
Biopolymer(s) 
adhesives, 539-541 
commercial production of, 535-547 
hyaluronic acid, 545-547 
melanin, 538-539 

polyhydroxyalkanoates, 542-545, 711, 830 
rubber, 541-542 
xanthan gum, 535-538 
Bioreactor(s), 701-705 
airlift, 702, 705-708 
bubble column, 701-705 
chicken eggs as, 886, 887 
designs of, 701-705 
for gene expression, 202-204 
mammalian cell expression in, 279-282 
mammary gland as, 873-875 
plants as, 825-830 
stirred-tank, 701-705 
two-stage, 705-708 

Bioremediation and biomass utilization, 
551-598 

lignin modification for, 835-836 
of xenobiotics, 551-569 
gene alteration in, 559-569 
plasmid transfer in, 557-559 
processes for, 551-556 
Biosensors, microbial, 343-345 
Biotin 

in cloning DNA sequences, 84 
in DNA hybridization, 350-352 
in DNA probe labeling, 75-76 
in isotope-coded affinity tag method, 
171-173 

in protein microarray analysis, 174 
Biotransformation, in commercial production, 4-5 

BiP protein, in mammalian cell expression, 
282 

Biphenyl dioxygenase, in trichloroethylene 
degradation, 563-564 
Bipyridyl, for siderophore growth, 611 
Bispecific diabodies, 408, 410 
Black spot, resistance to, 816 
Blackleg disease, 357 
Blastocysts, for transgenic mouse production, 851-855 

Blastoderm cells, for transgenic poultry 
production, 886, 887 

Bleaching process, peroxidase for, 325-326 
Blood group antigens, glycosidase alterations in, 394-395 

Blue mussel, byssal adhesive of, 539-541 
Blunt ends, in restriction endonucleases, 52, 
56, 105, 161 

Bohlen, Larry, StarLink corn studies of, 929 

Bolivar, F., 59 

Boll weevil, 766-767 

Bovine growth hormone, 905-907 


Bovine papillomavirus, as vector, 273 
Bovine spongiform encephalopathy, prevention of, 877-878 
Boxes, in transcription, 38-40 
Boyer, Herbert, 5-10, 918-919 
bphA1 gene, Burkholderia cepacia, 562-564 
Bradyrhizobium, nodulation in, 639 
Bradyrhizobium japonicum 
hydrogenase of, 631-633 
in nitrogen fixation, 619-620 
Brassica, stearate desaturase gene of, in 
canola, 807 

Brassinosteroids, 836-837 
Brevibacterium, in amino acid production, 
515-519 

Bromoxynil, plant resistance to, 784, 787 
BsmFI restriction endonuclease, in SAGE 
method, 161 

Bsu36I, in baculovirus-insect cell expression system, 264 

Bubble column bioreactors, 701-705 
Budding form, of baculovirus, 261-262 
Bull semen ribonuclease, 309-310 
Burkholderia, in nitroaromatic degradation, 
567-569 

Burkholderia cepacia 

antifungal enzymes of, 614 
in trichloroethylene degradation, 564 
Butyrate, in antibiotic production, 529-532 
By-product (exchanged-entry) plasmids, 

180 

Byssal adhesive, production of, 539-541 

C 

Cabbage looper, baculovirus effects on, 
678-679 

CABD sequence, in gene synthesis, 115 
Cadherins, in Bacillus thuringiensis toxin 
action, 677 

Cadmium, phytoremediation of, 839-840 
Caenorhabditis elegans 
double-stranded RNA in, 429 
in omega-3 fatty acid production, 884 
CAG nucleotides, in Huntington disease, 
867-870 

Calcium, binding to subtilisin, 316 
Calcium chloride, for Escherichia coli transformation, 92 

CAM (camphor-degrading) plasmid, 558 
CAMERA (Community Cyberinfrastructure 
for Advanced Marine Microbial 
Ecology and Analysis), 153-154 
Camphor-degrading plasmid, 558 
Cancer 

antibodies against, 421-422 
antisense RNA agents for, 428 
engineered RNase products for, 309-310 
immunotoxins for, 410 
ovarian, protein microarray analysis in, 
176 

Canola 

chitinase gene in, 789-790 
cross-pollination with field mustard, 933 




genetically modified, regulation of, 907 
lysine-enriched, 805 
oil from, modification of, 805-808 
salt-tolerant, 795 

Capillary electrophoresis, for ancestry 
determination, 361 

Capping, in DNA synthesis, 100, 102, 129, 
131 

Capture (analytical) protein microarrays, 
174-176 

Carboxylate group, of siderophores, 610 
Caries, dental, vaccine for, 479-480 
Carnation 

genetically modified, regulation of, 907 
pigmentation manipulation in, 824 
Carotenoids, plants enriched with, 811-812, 
925-926 

Cartagena Protocol on Biosafety, 908 
Casein, from transgenic animals 
in cystic fibrosis transmembrane regulator protein production, 873-874 
for improved cheese quality, 880 
Catabolic operons, 153 
Catabolite activator protein (cyclic AMP 
repressor protein), 197, 201 
Catalase, in superoxide anion destruction, 
792-793 

Catalytic activity, of enzymes, engineering 
of, 312-314 

Catechol, as biodegradation product, 552- 
555 

Catechol 2,3-dioxygenase, in ethylbenzoate 
degradation, 558-562 
Catecholate group, of siderophores, 610 
Cathepsin, in baculovirus-insect cell 
expression system, 270 

Cattle 

bovine growth hormone and, 905-907 
bovine spongiform encephalopathy in, 
877-878 
cloning of, 871 
mastitis in, 878-879, 906 
transgenic 

improved muscle mass in, 880-883 
methods for, 873 
milk from, 875, 879-880 
Cauliflower mosaic virus 35S promoter, 
743-745 

Caulobacter crescentus, in Bacillus thuringiensis toxin production, 667 
CCAAT sequence, in transcription, 38 
cDNA, see Complementary DNA 
Cell(s) 

death of, conditional control of, 870-871 
microbial 

disrupting, 714-716 
harvesting of, 711-714 
Cell growth, decrease of, in metabolic load, 
234 

Cell wall, removal of, for DNA transformation, 244 

Cellobiase (β-glucosidase), action of, 584-586 


Cellobiohydrolase, action of, 584 
Cellobiohydrolase I gene, 259 
Cellobiose, 588 
Cellulase(s) 
cloning of, 586-589 
domains of, 587 
of eukaryotes, 586 
in fruit ripening, 797 
industrial uses of, 589 
production of, 259 
of prokaryotes, 583-586 
Cellulomonas fimi, gene manipulation in, 
588 
Cellulose 

in lignocellulosics, 581-583, 594 
sources of, 580-581 
structure of, 583 
utilization of, 580-595 
cellulase genes and, 583-589 
importance of, 580-581 
lignocellulosic, 581-583, 594 
Zymomonas mobilis in, 589-595 
Cellulosomes, 583, 589 
Centrifugation, for microbial cell harvesting, 711-712 

Cephalosporins, production of, 526, 533- 
534 

Cephamycins, production of, 526 
Cervical cancer, human papillomavirus 
vaccine for, 468-469 
CFTR gene, in cystic fibrosis, 366-367 
Chagas disease, diagnosis of, 349-350 
Chain terminators, reversible, in DNA 
sequencing, 128-131 

Chain-terminating inhibitors, in PCR, 110 
Chalcone synthase, of chrysanthemums, 
822 

Chaperones 

for inclusion body control, 219-220 
for protein folding, 240-242, 251-253 
Chaperone/usher pathway, 43 
Cheese 

chymosin for, 903-904 
from transgenic animals, 880 
Chemical(s), in environment, microbial 
degeneration of, 551-569 
Chemical inducers, of gene expression, 202 
Chemically linked monoclonal antibodies, 
417, 419-420 

Chemiluminescence, for DNA hybridization, 350-352 

Chicken(s), transgenic, 885-887 
Chicken β-actin promoter, in omega-3 fatty 
acid production, 884 

Chimera, in transgenic poultry production, 
886-887 

Chimeric oligonucleotides (RNA-DNA 
molecules), 437 

Chimeric proteins, 303-304, 325 
Cry, 663 

in plant genetic engineering, 746 
in starch degradation, 820 
in transgenic mouse production, 867 


Chinese hamster ovary (CHO) cells, 272, 
275, 281, 284-285 
ChIP-on-chip assays, 160 
Chitinase 

in Bacillus thuringiensis toxin action, 675-676 

in baculovirus-insect cell expression system, 270 

in phytopathogen control, 614, 788-790 
Chloroplast(s) 

Bacillus thuringiensis toxin gene in, 763- 
764 

genetic transformation of, 738-741 
marker gene removal from, 753-755 
in polyhydroxyalkanoate production, 830 
in salicylic acid production, 788-789 
Chlorosis, in flooding, 605 
Cholera 

attenuated vaccine for, 481-482 
edible vaccine for, 831-832 
fowl, resistance to, 877 
subunit vaccine for, 466 
vector vaccine for, 495 
Cholesterol, elevated 
antisense oligonucleotides for, 431-432 
nucleic acids for, 451-452 
Cholesterol oxidase, for insect resistance, 
766-767 

Choline, for osmoprotection, 794 
Chorismate, salicylic acid synthesis from, 
789 

Chromatin, 40, 282-283 
Chromatin immunoprecipitation method, 
160 

Chromatography 

column, for microbial cell harvesting, 714 
for proteomics, 169 
Chromoplasts, 741 
Chromosome(s) 

bacterial artificial, 68-69 
DNA integration into, 222-228, 234-235 
in eukaryotic cells, 40 
human artificial, 449-450 
replication initiation in, 20 
yeast artificial, see Yeast artificial chromosomes 

yeast, for gene transfer, 738 
Chromosome 22, deletion of, in DiGeorge 
syndrome, 858 

Chrysanthemums, pigment manipulation 
in, 822 

Chymosin, recombinant, 903-904 
Chymotrypsin, Bacillus thuringiensis toxin 
interactions with, 676-677 
cI repressor, 202 

Cirrhosis, microarray analysis of, 158-160 
Citrate synthase, large-scale production of, 
202 

Citrate group, of siderophores, 610 
Clinical trials, process for, 385 
Clone(s) 
entry, 180 
expression, 180 
Clone banks, see Genomic libraries 
Cloning 

gene, see Recombinant DNA technology 
multiple sites for, 64-65 
Cloning vector(s) 
bacteriophage λ, 86-90 
binary, 731-733 
cointegrate, 733 
cosmids, 90-91 
for eukaryotic proteins, 80-86 
fusion protein, 206-207 
high-capacity, 92 
for large pieces of DNA, 86-92 
plasmid, 57-68 
promoter-tagging, 743 
shuttle, 67-68, 242-244 
Clostridium acetobutylicum, in isopropanol 
production, 577-578 

Clostridium beijerinckii, in isopropanol production, 577-578 

Clostridium tetani, vector vaccine for, 496 
Clostridium thermosulfurogenes, in fructose 
production, 576 

Coat proteins, for viral resistance, 773-779 

Cochaperones, 253 

Cochaperonins, 219-220 

Codex Alimentarius Commission, 911 

Coding strand, 213 

Codons 

Escherichia coli, 31-32 
in foreign vs. host genes, 235 
functions of, 28-30 
mRNA, mutagenesis of, 292-295 
in translation, 214 

Cofactors, of proteins, modifying requirements for, 316 

Cognate (specific) modification enzymes, in 
restriction nuclease production, 502 
Cohen, Stanley, 5-10, 918-919 
Cohesive ends, of bacteriophage λ, 87 
Cointegrate vector system, 733 
Colanic acid gene, deletion of, 222 
Colicin Ia, 411 

Collagen, for nucleic acid delivery, 453 
Colony collapse disorder, in honeybees, 934 
Color, of flowers, manipulation of, 821-825 
Colorado potato beetle, Bacillus thuringiensis toxin effects on, 676-677 
Column chromatography, for microbial cell 
harvesting, 714 

Combinatorial cDNA libraries, 411-416 
Commercial product(s), 501-550, see also 
Large-scale production; specific products 

adhesive protein, 539-541 
agricultural chemicals, 935-936 
alcohol, see Alcohol, production of 
amino acids, 514-519 
antibiotics, 521-535 
ascorbic acids, 507-512 
biopolymers, 535-547 
high-fructose corn syrup, 570, 575-576, 
819-820 


history of, 3-7 
hyaluronic acid, 545-547 
hydrogen gas, 595-596 
indigo, 512-514 
isopropanol, 577 
lipase, 505-506 
lycopene, 519 
melanin, 538-539 

polyhydroxyalkanoates, 542-545, 711 
restriction endonucleases, 501-504 
rubber, 541-542 

small biological molecules, 506-521 
succinic acid, 519-521 
summary, 545-547 
xanthan gum, 535-538 
Community Cyberinfrastructure for 

Advanced Marine Microbial Ecology 
and Analysis (CAMERA), 153-154 
Competence, Escherichia coli, 60 
Complement inhibitor, in organ transplantation, 876 

Complementarity-determining regions 

(CDRs), of antibodies, 321, 400-401, 
405-406, 411, 416-417 
Complementary base pairing 
of DNA, 16-18 
of RNA, 21-22 

Complementary DNA, 81-82 
amplification of, 113 
of cellulase, 586 

for DNA microarray analysis, 155-156 
of interferons, 380-381 
libraries for, 210-212 
patenting of, 916 

in protein-protein interaction, 182-183, 
186 

in SAGE method, 160-163 
for superoxide dismutase, 249-250 
synthesis of, 81-86 
Complementary strand, 213 
Complementation approach 
to antibiotic synthesis, 524 
to hydrogenase genetic engineering, 632- 
635 

Complementation assay, for protein-protein interaction, 184 

Component I and component II, of nitrogenase, 621-622 

Concatamers, in SAGE method, 161 
Confocal scanning microscope, in DNA 
microarray analysis, 156-157 
Conjugation, in bacteria, 93-94 
Consensus sequences, 203-204 
Constant regions, of antibodies, 401 
Constitutive secretion, of proteins, 44 
Contamination, environmental, dealing 
with, see Bioremediation and biomass utilization; Phytoremediation 
Contigs, in libraries, 149 
Continuous fermentation, 691-692 
Coprinus cinereus, peroxidase of, 325-326 
Copyrights, 912 

Corepressor, in transcription, 36 


Corn, 824 

fatty acid modifications in, 748 
field tests for, 900-902 
fructose and alcohol production from, 
570-576 

gene transfer in, 734-735 
genetically modified 
export of, 936-937 
nutritional content of, 925 
regulation of, 907, 909 
glyphosate N-acetyltransferase gene in, 
786 

insect-resistant, 934 
lysine-enriched, 805 
phytic acid-enriched, 815 
phytopathogen resistance in, 789 
RAPD fingerprinting of, 357 
RNA interference genes in, 769 
salt-tolerant, 795 
StarLink, 928-929 
vitamin E-enriched, 810-811 
Corn rootworm, Bacillus thuringiensis toxin 
effects on, 676-677 

Corn stearoyl-acyl carrier protein, 748 
Corn ubiquitin gene promoter, 780 
Coronary artery disease, antisense oligonucleotides for, 431 
Coronavirus, vaccine for, 466-467 
Corynebacterium 

in amino acid production, 514-519 
in ascorbic acid production, 507-512 
fructosyl-amino acid oxidase of, 322-323 
in lysine enrichment, 805 
Corynebacterium diphtheriae, toxin of, cell 
death due to, 870-871 
Corynebacterium glutamicum, in amino acid 
production, 515-519 
Cosmids, as cloning vectors, 90-91 
Cosuppression, 747, 822 
Cotransformation, of plant cells, 752-753 
Cotton 

genetically modified 
export of, 936-937 
regulation of, 907 
insect-resistant, 934, 936 
salt-tolerant, 795 

Cotton bollworm, gossypol in, 769 
4-Coumarate:coenzyme A ligase, in lignin 
synthesis, 834-835 

Coupling efficiency, in DNA synthesis, 103 
Cow, see Cattle 

Cowpea trypsin inhibitor, for insect resistance, 764-766 

Cowpea weevil, for insect resistance, 766 
Crabtree effect, 253 

Cre-loxP "insertion-removal" system, for 
selectable marker genes, 227-228 
Cre-loxP recombination system 
for myostatin alterations, 881-882 
for transgenic mouse production, 856- 
858 

Creutzfeldt-Jakob disease, prevention of, 
878 
Crick, Francis, 16-18 
Criminal investigations 

animal species determination for, 364 
DNA fingerprinting for, 353-355 
DNA tests for, 361-364 
RAPD markers for, 356 
Crookneck squash, viral coat proteins in, 
777-778 

Cross-flow filtration, for microbial cell harvesting, 713-714 
Crown gall disease, 739 
antibiotics for, 614 
Ti plasmid in, 726-730 
Cry proteins and cry genes, of Bacillus thuringiensis, 654, 656 
chimeric, 663 
classification of, 654, 656 
for improved biocontrol, 674-677 
modifying for increased toxicity, 665-666 
in protoxin, 670-671 
resistance and, 672-674 
safety of, 927-929 
toxicity of, 767 
transferring of, 664-665 
Cryptic plasmids, 58 
Cryptosporidium parvum, PCR monitoring 
of, 358 

Cucumber mosaic virus, coat protein of, 
776-778 

Cucumber necrosis virus, 781-782 
Cucumber, phytopathogen control on, 618 
Culture 

for fermentation, 693-694 
microorganisms not available through, 
148 

of monoclonal antibodies, 339, 341 
Curb gene, deletion of, 222 
Cyanamide, plant resistance to, 784 
Cyanidin derivatives, in flower pigmentation, 823-824 

β-Cyanoethyl group, in DNA synthesis, 
100, 103 

Cyanovirin N, for HIV infection, 398-399 
Cycle sequencing, DNA, 123-124, 132-133 
Cyclic AMP repressor protein (catabolite 
activator protein), 197, 201 
Cyclic array sequencing, 136-142 
Cyclohexanediones, plant resistance to, 

784 

Cysteine 

in β-interferon, 311-312 
commercial production of, 517 
in lysozyme engineering, 305-306, 308 
Cystic fibrosis 
DNase I for, 389-390 
Pseudomonas aeruginosa infections in, 391 
screening for, 366-367 
Cystic fibrosis transmembrane regulator 
protein, in milk, 873-874 
Cytidine monophosphate-sialic acid, in 
baculovirus-insect cell expression 
system, 269-270 

Cytochrome P450 gene, silencing of, 769 


Cytokines, protein microarray analysis for, 
174-175 
Cytokinins 

in crown gall disease, 728-730 
for senescence suppression, 796 
Cytomegalovirus 

in baculovirus production, 681 
infections with 

antisense oligonucleotides for, 429-430 
α1-antitrypsin for, 394 
ribozymes for, 435 

Cytomegalovirus enhancer, in omega-3 
fatty acid production, 884 
Cytomegalovirus promoter, in mammalian 
cell expression, 272, 280-282, 284 
Cytosine (C) 
derivatization of, 99, 103 
in DNA structure, 15-17 
Cytotoxic T-lymphocyte antigen 4, in dental 
caries vaccine, 480 

D 

Daffodil, phytoene synthase gene of, 811- 
812 

Dairy products, see also Milk 
chymosin for, 903-904 
in xanthan gum production, 537 
Dalapon, plant resistance to, 784 
dam gene, in Salmonella vaccine production, 
482-484 

DAO gene, in marker removal, 753 
Databases 

for gel electrophoresis results, 167 
molecular, 146-148 
DdeI restriction enzyme, 504 
Deacetoxycephalosporin C, 534 
Deamidation, of amino acids, at high temperatures, prevention of, 309-310 
Death phase, of microbial growth, 687-689 
Deceleration phase, of microbial growth, 
687-689 

Degeneracy, sequence, 298-299 
Degradation, see Biodegradation 
Degradative plasmids, 58, 643-644 
Dehydrosqualene synthase, in antibiotic 
production, 535 

Deinococcus radiodurans, in radioactive- 
waste degradation, 566-567 
Delphinidin derivatives, in flower pigmentation, 823-824 
Denaturation, in PCR, 110 
Dental caries, vaccine for, 479-480 
Deoxyadenosine α-thiotriphosphate, in 
pyrosequencing, 128 

3-Deoxy-D-arabino-heptulosonate 7-phosphate 
synthase, in amino acid production, 516 

Deoxyribonuclease I (DNase I) 
for DNA shuffling, 303-304 
production of, 389-390 
Deoxyribonucleotide triphosphate, 18-19 
Deoxyribonucleic acid, see DNA 
Deoxyribozymes, 436-437 


Department of Agriculture regulations 
for genetically modified crops, 908 
for release of genetically modified organisms, 901-902 

Derepression, of trp promoter, 197 
Designer antibiotics, 534-535 
Designer cellulosomes, 589 
Desulfovibrio desulfuricans, DdeI restriction 
enzyme of, 504 

Detergents, for microbial cell disruption, 
714-715 

Detritylation, in DNA synthesis, 100, 103 
Dextrins, as starch degradation product, 
570, 573 

Diabetes mellitus, hemoglobin A1c in, 322-323 

Diabodies, 408, 410 

2,4-Diacetylphloroglucinol, for phytopathogen control, 612-613 

Dialysis, for protein purification, 717-718 
Dicamba, plant resistance to, 786-787 
Dicer (ribonuclease), in transgenic mouse 
production, 860-861 

Dicotyledonous plants, crown gall disease 
of, 726-730 

Dideoxynucleotide, structure of, 118-119 
Dideoxynucleotide procedure (Sanger), 
118-124, 133 

Dihydrodipicolinic acid synthase, in seed 
amino acid modification, 805 
Dihydroflavonol 4-reductase, in flower pigmentation, 824 

Dihydrofolate reductase-methotrexate system, 278-279 

Dihydrofolate reductase-thymidylate synthase, of Leishmania, 484-485 
Dihydrogranaticin, production of, 529 
Dihydrogranatirhodine, production of, 528- 
529 

Dihydrokaempferol, in flower pigmentation, 
824 

2-Dihydroxyisovalerate, in valine production, 518-519 

Dihydroxyphenylalanine, in melanin production, 538-539 

Diisopropylamine, in DNA synthesis, 100 

2,5-Diketo-D-gluconic acid, in ascorbic acid production, 507-512 
Dilution rate, in fermentation, 691-692 
Dimethoxytrityl group, in DNA synthesis, 
100 

Dimethylallyl diphosphate, in lycopene 
production, 519 

2,4-Dinitrotoluene, degradation of, 567-569, 
643-644 

Diphtheria toxin, cell death due to, 870-871 
Directed mutagenesis, 291-305 
DNA shuffling in, 303-304, 325-326 
for enzyme activity engineering, 312- 
314 

error-prone PCR in, 298 
examples of, 290-291 
oligonucleotide-directed 

for adding disulfide bonds, 305-306, 
308 

with M13 DNA, 292-295 
PCR-amplified, 297-298 
with plasmid DNA, 295-297 
overview of, 291-292 
random 

for antibody engineering, 321 
with degenerate oligonucleotide primers, 298-300 

with random insertion/deletion, 300- 
303 

summary, 327 

for thermostability improvement, 306, 
308-311 

unusual amino acids for, 304-305 
Discoloration, of fruits and vegetables, 815- 
817 

Discosoma, fluorescent protein of, 341-342 
Disposable elements, in plants, 752-753 
Disulfide bonds 
adding, 305-310 
formation of 

in bacteria, 218-219 
in Saccharomyces cerevisiae, 250-252 
stabilization of, 409-410 
Disulfide isomerase, 251-252 
Dithiothreitol, for fusion protein purification, 210 

DNA 

aptamers of, 437-440 
autosomal, 361 
chemical synthesis of, 98-108 
automated, 98-99 
phosphoramidite method, 99-103 
summary, 142-143 

synthetic oligonucleotides for, 103-108 
in chimeric RNA-DNA molecules, 437, 
746 

in chloroplasts, 738-741, 753-755 
circular, see also Plasmid(s) 
production of, 297-298 
restriction endonuclease map for, 55 
cloning of, see Recombinant DNA technology 
coiled, 282 

complementary, see Complementary 
DNA 

degradation of, in digestive system, 930- 
931 

discovery of, 16 
double-stranded, 16, 17, 81 
of eukaryotic cells, 40 
functions of, 17-18 
hairpin, 304 

injection of, for transgenic mouse production, 850-851 

integration into chromosomes, 222-228, 
234-235 

minisatellite, 354-355 
mitochondrial, in ancestry determination, 
361-362 


naked, in gene therapy, 449 
nuclear, marker gene removal from, 752- 
753 

paternal, 361 

plasmid, large-scale production of, 719 
random amplified polymorphic, 355-357 
replication of, 18-20, 22 
vs. RNA, 15 

RNA molecules transcribed from, 21-23 
sequences, patenting of, 916-917 
shuffling of, 303-304, 325-326, 382 
structure of, 14-18, 20 
synthesis of 
chemical, 98-108 
natural (replication), 18-20, 22 
summary, 45 

transferred, 727-733, 743-744, 752-753 
transferring into Escherichia coli, 92 
DNA diagnostic systems, 345-365, see also 
DNA hybridization; Polymerase 
chain reaction (PCR) 
for ancestry determination, 361-364 
for animal species determination, 364 
automated, 364-365 
for genetic disease, 365-375 
for malaria, 347-348 
microbial biosensors, 343-345 
molecular beacons, 352-353 
nonradioactive, 350-352 
random amplified polymorphic DNA, 
355-357 

for Trypanosoma cruzi, 349-350 
DNA fingerprinting, 353-355 
DNA hybridization 

allele-specific, for cystic fibrosis, 366-367 
for DNA microarray analysis, 156 
for genomic library screening, 70, 72-76 
nonradioactive, 350-352 
procedure for, 70, 72-76, 345-346 
DNA ligase, 56 
in fermentation, 706-708 
in gene expression, 201 
in nonhomologous-end-joining pathway, 
261 

in oligonucleotide-directed mutagenesis, 
292-295 

in random insertion/deletion mutagenesis, 301-303 

DNA microarray analysis, 155-160 
commercial products for, 160 
data management in, 156, 158 
importance of, 160 
probes for, 155 
procedure for, 155-157 
purposes of, 158-160 
systems for, 155 

DNA modules, in transcription, 38-40 
DNA polymerase(s) 

in DNA sequencing, 134 
in DNA shuffling, 304 
in double-stranded DNA formation, 81 
in error-prone PCR, 298 
in oligonucleotide synthesis, 106, 108 


in oligonucleotide-directed mutagenesis, 
292 

in random insertion/deletion mutagenesis, 301 

in replication, 19-20 
DNA probe(s) 
diagnostic use of, 345-346 
for DNA microarray technology, 155- 
158 

for genomic library screening, 70, 72-76 
hybridization, 345-346 
labeling of, 73, 75-76 
mixed, 103-104 
molecular beacons as, 352-353 
nonradioactive, 350-352 
for oligonucleotide ligation assay, 367- 
374 

padlock, 374 
sources of, 76 
TaqMan, 375 

DNA sequencing, 117-133 
applications of, 117-118 
automated, 121-124 
cycle sequencing, 123-124 
de novo, 133 

dideoxynucleotide procedure (Sanger) 
for, 118-124, 133 
large-scale, 133-142 
nanopore, 142 

primer-walking method for, 124-125 
pyrosequencing method for, 125-128, 

141 

resequencing projects, 133 
reversible chain terminators in, 128-131 
shotgun cloning strategy for, 133-136 
summary, 142-143 
DNA vaccine(s), 472-480 
delivery of, 472-479 
dental caries bacteria, 479-480 
DNA-binding domain, in protein-protein 
interaction, 182-183 
DNase I, production of, 389-390 
Dockerin, in cellulosome, 589 
Dolly (cloned sheep), 871-873 
Double digestion, restriction endonucleases 
for, 53, 55 

Double-cassette vector(s), 274 
doublesex gene, 25 

Downstream processing, in fermentation, 
717-719 

Downstream region, of DNA, 23 
Doxycycline, in transgenic mouse production, 867, 869 

Drosophila, alternate gene splicing in, 25 
Drought, plants resistant to, 793-796 
Drug(s), see Pharmaceutical(s) 

Dsb proteins, for disulfide bond formation, 
218-219 

Dual-variable-domain antibodies, 420 
dUTPase, in directed mutagenesis, 294-295, 
298 

Dye(s), commercial production of, 512-514 
Dye transfer inhibitors, 325-326 
E 

Economic issues 
in genome sequencing, 133 
in molecular biotechnology, 8, 935-937 
in molecular diagnostics, 333-334 
in patenting inventions, 912, 919 
EcoRI restriction endonuclease, 49-55 
Edible oils, modification of, 805-808 
Edible vaccines, 830-832 
Effector molecules, in transcription, 35-36 
Effector-repressor complex, in transcription, 34-35 

Egg(s) 

fish, transgenesis procedures for, 888-889 
livestock, nuclear transfer in, 871-873 
mouse, manipulation of, 850-851 
poultry, transgenesis procedures for, 885- 
887 

Elaioplasts, 741 
Electrophoresis 

capillary, for ancestry determination, 361 
gel, see Gel electrophoresis 
Electroporation 

in binary-vector preparation, 733-734 
in DNA transformation, 244 
in DNA vaccine production, 476-477 
in filamentous fungal systems, 260 
in prokaryotes, 93 

Electrospray ionization, mass spectrometry 
with, for proteomics, 167-169 
ELISA, see Enzyme-linked immunosorbent 
assay (ELISA) 

Elongation, 28-29, 285 
Elongation factor 2, in cell death, 870-871 
Embryonic stem cells 
for transgenic mouse production, 851- 
855 

for transgenic poultry production, 886- 
887 

Emulsion polymerase chain reaction, 142 
Endoglucanase, 584, 588-589 
Endonucleases 
definition of, 47 

modifying specificity of, 318-320 
restriction, see Restriction endonucleases 
Endophytes, bacterial, 644-645 
Endoplasmic reticulum, glycosylation in, 
242 

Endoproteases, engineering of, 314-315 
Enhancer sequences, in plant gene manipulation, 743 

5-Enolpyruvylshikimate-3-phosphate synthase, 930 

Enoylreductase, in antibiotic production, 
530-532 

Enterobacter agglomerans, antifungal 
enzymes of, 614 
Enterokinase, 210, 323-324 
Enteropeptidase, increasing stability and 
specificity of, 323-324 
Enterotoxin, Vibrio cholerae, 466 
Entry clone, in protein microarray analysis, 
180 


Environment 

contamination removal from, see 
Bioremediation and biomass utilization; Phytoremediation 
genetically modified products in 
Bacillus thuringiensis toxin effect on 
nontarget insects, 933-934 
benefits of, 934-935 
biodiversity and, 932-933 
regulation of, see Regulations 
protection of, in fermentation, 693 
Environmental Protection Agency regulations 

for genetically modified crops, 908 
for genetically modified organisms, 901- 
902 

Enviropig, 885 

Enzyme(s), see also specific enzymes 
activity of, improvement of, 312-315 
alterations of, see Directed mutagenesis 
cell surface-expressed, in organophosphate degradation, 564-565 
cofactors of, alteration of, 291 
engineering improvements in, see Protein 
engineering 

genomic libraries for, screening of, 79 
increasing stability and specificity of, 
321-324 

kinetics of, 312-314 
microarray analysis for, 176 
for microbial cell disruption, 715 
for phytopathogen control, 614 
reactivity of, alteration of, 291 
for recombinant DNA technology, 56 
replacement of, 445-446 
restriction, see Restriction endonucleases 
therapeutic, 389-395 
thermostability of, improvement of, 306, 
308-311 

Enzyme-linked immunosorbent assay 
(ELISA), 335-336 

vs. immunoquantitative real-time PCR, 
361 

monoclonal antibodies in, 339, 341 
Eosinophilia-myalgia syndrome, 904-905 
Epidermal growth factor, heparin-binding, 
in cell death, 870-871 
Epigenetic modifications, 283 
Epstein-Barr virus 

in mammalian cell expression, 281 
monoclonal antibodies against, 411 
Erect leaves, engineering of, 836-837 
Ereky, Karl, 3-4 

Erucic acid, in canola oil, 807 
Erwinia carotovora 
control of, 618 

lysozyme effects on, 791-792 
Erwinia herbicola 

in ascorbic acid production, 507-512 
in 2-keto-L-gulonate production, 540 
Erythromycin, production of, 529 
Escherichia coli 

in adhesive protein production, 541 


in alcohol production, 594 
in amino acid production, 515-519 
in 1-aminocyclopropane-carboxylate 
deaminase production, 799 
in apolipoprotein B production, 711 
in ascorbic acid production, 509, 511 
bacteriophage λ of, 86-87 
bacteriophage P1 of, 90, 856-858 
in baculovirus-insect cell expression system, 265-267 

choline dehydrogenase in, 794 
in chymosin production, 903-904 
cloned-gene expression in, 195-239 
cloning vectors for, 67 
codons of, 31-32 
competence of, 60 
cry genes of, 660 
DNA replication in, 20 
DNA transfer into, 92 
endoprotease of, 314-315 
in ethylbenzoate degradation, 562 
as experimental organism, 899 
in fermentation, 693-698, 700-701, 706- 
711 

in fructose production, 576 
functional antibody production in, 412- 
415 

in fusion protein production, 205-212 
glucuronidase reporter gene of, 742-743 
growth of, 689 

high-density cultures of, 693-694 
in human growth hormone production, 
384 

in hydrogen production, 595-596 
hydrogenase of, 633 
in indigo production, 512-514 
interferon polypeptide synthesis in, 410 
in isopropanol production, 577-578 
in lycopene production, 519 
in lysine enrichment, 805 
in metagenomic studies, 150-151 
for nucleic acid delivery, 452-453 
oligonucleotides of, directed mutagenesis 
of, 292-294 

in organophosphate degradation, 564-565 

outer membrane protein A, 211, 230-232 
oxygen-limited, protein synthesis in, 690 
peptide-glycan-associated lipoprotein of, 
211 

plasmids of, large-scale production of, 
719 

in polyhydroxyalkanoate production, 
542-545, 711 

in polyketide antibiotic production, 534 
promoters of, 196-205 
protease-deficient strain of, 220 
protein folding in, 217-219 
quiescent, 796-797 
in recombinational cloning, 180 
restriction endonuclease of, 49-50 
in restriction endonuclease production, 
502-504 
ribosome-binding site in, 213 
ribosomes in, 29 

Shiga toxin-producing, edible vaccine for, 
832 

sigma factor of, 33 

in starch hydrolysis fusion protein production, 820 
structural genes of, 33 
Ti plasmid replication in, 731-733 
in tissue plasminogen activator production, 915 

transfer RNA overproduction by, 214 
transformation of, 60-63 
translational errors in, 234 
in trehalose production, 794-795 
in trichloroethylene degradation, 562-564 
in tryptophan production, 904-905 
in vaccine production, 473-474, 494 
in xanthan gum production, 537-538 
Escherichia coli aminoacyl-tRNA synthetases, 304-305 

Estrogenic activity, monitoring of, 344, 889- 
890 

Ethanol, see Alcohol 

Ethical concerns, in patenting animals, 918 
Ethnicity, DNA testing for, 361-364 
4-Ethylbenzoate, microbial degradation of, 
559-562 
Ethylene 

action of, 602-605 
in flower wilting, 797-799 
in fruit ripening, 797-799 
in gene transfer, 735 
lowering of, 604 
nodulation and, 640 
stress, 604-605, 617-618 
synthesis of, 602, 621 
Ethylenediamine, of siderophores, 610 
Ethylenediaminetetraacetic acid, for microbial cell disruption, 715 
1,1'-Ethylidenebis[L-tryptophan], toxicity of, 
904-905 

Euchromatin, 40, 282 
Eukaryote(s), see also Animal(s); Plant(s) 
cellulases of, 586 

cloning vectors for, high-capacity, 92 
genomic library for, 68 
proteins of 

cloning DNA sequences for, 80-86 
heterologous production of, 240-289 
secretion pathways of, 40-44 
transcription in, 23, 37-40 
translation in, 26-28 
Eukaryotic initiation factor 2α, 490 
European spruce sawfly, baculovirus effects 
on, 678 

European Union, food export to, 936-937 
Evolution, directed, see Directed mutagenesis 

Exchanged-entry (by-product) plasmids, 

180 

Exoglucanase, 584-585, 588-589 


Exons, 23-24 

Exonucleases, definition of, 47 

Exponential phase, of PCR, 358 

Exports, of genetically engineered food, 
936-937 

Expressed sequence tags, patenting of, 916 

Expression clone, in protein microarray 
analysis, 180 

Expression vectors, 196, 201, 203-204, 214 
baculovirus, 263-264, 275-278 
design of, 272-275 
eukaryotic, 242-244 

for mammalian cell expression, 272-279 
Pichia pastoris, 255 
Saccharomyces cerevisiae, 245-248 
selectable markers for, 278 
in translation, 214 

External-loop airlift bioreactors, 702, 705- 
708 


F 

F plasmids, 58 
Fab fragments, 401-402 
engineering of, 409 
modifying specificity of, 321 
production of, 710-711 
Factor IX, from milk of transgenic animals, 
875 

Factor X, for fusion protein purification, 
210, 229 

FadR protein, in polyhydroxyalkanoate 
production, 544 
FailSafe trademark, 912 
The Farm (painting), 11 
Farnesyl diphosphate, in antibiotic production, 535 

Fat(s) 

for nucleic acid delivery, 451-452 
in plants, modification of, 805-808 
Fatty acids, modification of, 805-808 
Fc fragments 
engineering of, 408 

of human-mouse hybrid antibody, 403- 
404 

Fc region, of immunoglobulin, 401-402, 480 
FDA regulations, see Food and Drug 

Administration (FDA) regulations 
Fed-batch fermentation, 689-690, 708-711 
Fenitrothion, microbial degeneration of, 
567-569 
Fermentation 

environmental safety in, 693 
inhibitors of, 694 
large-scale, 687-719, 706-708 
acetate reduction in, 698-701 
batch, 687-689, 708-711 
bioreactors for, 701-705 
continuous, 691-692 
downstream processing in, 717-719 
efficiency of, 692-701 
fed-batch, 689-690, 708-711 
generalized scheme for, 686 
high-density cell cultures for, 693-694 


microbial cell disruption in, 714-716 
microbial cell harvesting in, 711-714 
microbial growth and, 687-692 
plasmid stability in, 695-696 
protein secretion in, 696, 698 
quiescent Escherichia coli cells in, 696- 
697 

technical challenges in, 685-687 
two-stage in single stirred-tank reactor, 
708 

two-stage in tandem airlift reactions, 
706-708 
silage, 576-577 
Ferredoxin 

in herbicide resistance, 787 
in trichloroethylene degradation, 563-564 
Ferritin, plants enriched with, 813-814 
Fertilizers 
bacterial, 600, 619 
nitrogen content of, 619 
FhlA proteins, in hydrogen production, 
595-596 

Field mustard, cross-pollination with canola, 933 
Field tests 

for genetically modified crops, 907-910 
for genetically modified organisms, 900- 
902 

Filament protein, 211 

Filamentous fungi, expression systems for, 
259-261 

Filtration, membrane, for microbial cell harvesting, 712-714 
Fingerprinting 
DNA, 353-355 
peptide, 167 

of plant cultivars, 355-357 
Fireflies, luciferase of, 342-343 
First-to-invent principle, 912 
Fish 

infections in, antibody gene manipulation for, 443-444 
transgenic, 886, 888-890 
FK506, production of, 529-532 
Flavobacterium 
alginate lyase of, 391-392 
in organophosphate degradation, 564-565 

Flavobacterium okeanokoites, FokI of, 320 
Flavonoids, in nodulation, 636, 638 
Flavr Savr tomato, 797 
Flooding, plant response to, 605 
Flowers 

pigmentation alterations in, 821-825 
wilting regulation in, 796-799 
Fluorescein, for DNA hybridization, 351 
Fluorescent dyes 

for chain termination method, 129-131 
for dideoxynucleotide procedure, 121- 
122 

for DNA hybridization, 350-352 
for DNA microarray analysis, 155-160 
for endoprotease engineering, 314-315 
for gel electrophoresis, for proteomics, 
169-171 

for genotyping, 374 
for ligation sequencing, 131-133 
for metagenomic studies, 151-153 
for PCR, 358-359 

for protein microarray analysis, 172, 174 
Fluorescent proteins, 341-342 

for environmental pollution monitoring 
in fish, 888-890 
as reporters, 742-743 
Fluorophores, in molecular beacons, 352- 
353 

Foaming, in bioreactors, 703-704 
FokI restriction endonuclease, 51, 318-320 
Folate, plants enriched with, 812-813 
Folding, of proteins 
facilitation of, 217-219 
in Saccharomyces cerevisiae systems, 246- 
248 

Fomivirsen, for cytomegalovirus infections, 
429-430 

Food, see also specific food plants, e.g., Rice; 
Potato; Tomato 
discoloration of, 815-817 
edible vaccines in, 830-832 
frozen, starches designed for, 819 
genetically modified 
allergens in, 927-930 
gene transfer to humans or intestinal 
microorganisms in, 930-931 
labeling issues in, 931-932 
nutritional alterations and, 924-926 
public acceptance of, 726 
regulations for, 903-911 
safety of, 923-932 
toxins in, 927-930 

nutritional content of, modification of, 
803-815 

sweetness of, 817-818 
Food and Drug Administration (FDA) regulations 

for food and food ingredients, 903-911 
for fusion proteins, 209-210 
for genetically modified crops, 908 
for labeling, 931-932 
for transgenic livestock, 910-911 
Foot protein, of mussels, 541 
Foot-and-mouth disease virus, peptide vaccine for, 470-471, 477 
Forensic investigations 

animal species determination for, 364 
DNA fingerprinting for, 353-355 
DNA testing for, 361-364 
RAPD markers in, 356 
Forkhead box O1 transcription factor, 
inhibitors of, 432 

Formate hydrogen lyase system, 595-596 
Formic acid, in hydrogen production, 595- 
596 

Fosmids, 90 

Foundation of Economic Trends, 901 
Fowl cholera, resistance to, 877 


FOXO1, inhibitors of, 432 

Franck, Richard, 147 

Frankia, in plant growth promotion, 600 

Freezing/frost 

food, starches designed for, 819 
for microbial cell disruption, 715 
proteins involved in, 614, 616-617, 900- 
902 

Fructans, for sweetness improvement, 817- 
818 

Fructose 

commercial production of, 570-576, 819- 
820 

fermentation of, acetate accumulation 
and, 698 

Fructosyl-amino acid oxidase, increasing 
stability and specificity of, 322-323 
Fruit(s) 

discoloration of, 815-817 
ripening of, 796-799 
sweetness of, 817-818 

Fumarate, in succinic acid production, 519- 
521 

Functional genomics, 154-163 
definition of, 154 

DNA microarray analysis, 155-160 
SAGE (serial analysis of gene expression) 
in, 160-163 

Functional protein microarray analysis, 
176-181 

Fungi, 609, see also specific fungi 
antibiotics for, 612, 614 
ethylene synthesis related to, 617-618 
expression systems for, 244-261 
Arxula adeninivorans, 255-259 
filamentous, 259-261 
Hansenula polymorpha, 255 
Pichia pastoris, 253-255 
Saccharomyces cerevisiae, 244-253 
Yarrowia lipolytica, 255 
phytopathogens of, 609 
plant resistance to, 787-793, 936 
Fusarium, phytopathogen resistance to, 
790-791 

Fusarium solani, in antibiotic production, 
533 

Fusion protein(s), 205-212 
albumin-growth hormone, 387-388 
albumin-interferon, 383 
antibodies against, 207 
for Bacillus thuringiensis resistance prevention, 771-773 
cleavage of, 208-210 
construction of, 205-206 
for edible vaccines, 832 
for fermentation, 708 
foot-and-mouth disease virus, 465-466, 471 

histidine-tagged, 208 
for mammalian cell expression, 284 
for microbial degeneration, 564-565 
for monoclonal antibody preparation, 
338-339 


multidomain, 480 

for nucleic acid delivery, 454-455 

in periplasm, 229 

for phytopathogen control, 790-791 
for recombinant protein purification, 
207-208 

stability of, 205-206 
Streptococcus, 480 
surface display of, 210-212 
for sweetness improvement, 817-818 
for toxin resistance prevention, 672 
uses of, 206-208 
Fusion vector system, 206-207 
Fv fragments 
engineering of, 401, 409 
single-chain, 409, 696-697, 781-782 

G 

G-418 

as marker gene, 278 
in myostatin alterations, 881 
in transgenic mouse production, 853-855 
α-Galactosidase, 395 
β-Galactosidase 
in gene expression, 196-197 
large-scale production of, 202 
in plasmid cloning, 65-66 
production of, 207 
promoters for, 205 

in protein-protein interaction, 183-184 
Galactotransferase, in baculovirus-insect 
cell expression system, 267, 269-270 
Gateway (recombinational) cloning, 178- 
181 

GC box, in transcription, 38 
Gel electrophoresis, 53-56 
for dideoxynucleotide procedure, 120 
polyacrylamide, for proteomics, 166-169 
for protein expression profiling, 169-171 
two-dimensional differential in-gel, 169- 
171 

Gelatinization, of starch, 570 
GenBank database, 146 
Gene(s) 

antibody, 443-444 
assembly of, 105-108 
housekeeping, 37 
number of, 164-165 
silencing of, 440, 768-769 
structural, 23, 33, 37-38 
suicide, 66 

synthesis of, 105-108, 113-117 
Gene α-IRES-gene β construct, 274-275 
Gene banks, see Genomic libraries 
Gene chips, see DNA microarray analysis 
Gene expression 

for DNA integration into chromosomes, 
222-228, 234-235 
in fungi, 244-261 

fusion proteins for, 205-212, see also 
Fusion protein(s) 

for increasing protein production, 201 
for increasing protein secretion, 228-232 

for increasing protein stability, 215-219 
large-scale systems for, 201-204 
for limiting biofilm formation, 221-222 
metabolic load and, 233-235 
microarray analysis of, 158-160 
in non-Escherichia coli systems, 204-205 
for overcoming oxygen limitation, 220- 
221 

profiling of, 169-172 
in prokaryotes, 195-239 
from strong regulatable promoters, 196- 
201 

summary, 233-235 

translation expression vectors for, 212- 
214 

"Gene machines," 98-99 
Gene pyramiding, for resistance prevention, 671 

Gene stacking, for Bacillus thuringiensis 
resistance prevention, 772-773 
Gene therapy 
history of, 446 

nucleic acid delivery for, 444-451 
vectors for, 447-451 
Genentech, 9-10, 915 
General secretion pathway, 42 
Genetic disease(s) 
gene therapy for, 444-451 
molecular diagnosis of, 365-375 
cystic fibrosis, 366-367 
fluorescence-labeled primers in, 374 
padlock probes for, 374 
PCR/OLA procedure, 367-374 
sickle-cell anemia, 367 
summary, 375-376 
TaqMan protocol for, 375 
treatment of, 445 

Genetic immunization, see DNA vaccine(s) 
Genome(s) 
in databases, 146-148 
number of genes in, 164-165 
sequencing of, 133 
cyclic array, 136-139 
shotgun cloning strategy for, 133-136 
X prize offer for, 142 
sizes of, 71 
Genomic(s) 
definition of, 147 
functional, 154-163 
Genomic libraries, 68-80 
creation of, 68-70, 92, 148-149 
for DNA sequencing, 134 
for restriction endonuclease testing, 504 
screening of, 149-150 
by DNA hybridization, 70, 72-76 
by immunological assay, 76-78 
mixed probes for, 103-104 
by protein activity, 78-80 
Genotyping, with fluorescence-labeled PCR 
primers, 374 

Gentamicin resistance gene, in baculovirus- 
insect cell expression system, 265-266 


Germinal disc, of avian eggs, transgene 
injection into, 885 
Gibberellins, 836-837 
Glioma, antisense RNA agents for, 428 
Globin defects, in sickle-cell disease, 349, 
367-368 

Glomus mosseae, chitinase effects on, 790 
β-1,3-Glucanase, for phytopathogen control, 614, 788-789 
Glucoamylase 
heat-resistant, 572 
in starch hydrolysis, 570-576 
Glucoamylase A gene, 259 
Gluconeogenesis, inhibitors of, 432 
Glucose 

in alcohol production, 590-591 
in ascorbic acid production, 507-512 
in cellulose, 583 

in fermentation, acetate accumulation 
and, 694, 698-701 
as fermentation inhibitor, 694 
in fructose and alcohol production, 570- 
576 

in isopropanol production, 577-578 
in starch, 583 

as starch degradation product, 569 
in succinic acid production, 520-521 
in valine production, 518-519 
in xanthan gum production, 535-538 
Glucose isomerase, 570-571, 575-576, 820 
Glucose-6-phosphatase, inhibitors of, 432 
Glucose-6-phosphate, in phytic acid synthesis, 815 

Glucose-1-phosphate, in starch synthesis, 
818-819 

β-Glucosidase, action of, 584-586, 588-589 
Glucosyltransferase, in dental caries vaccine, 479-480 

Glucuronic acid, 545-547, 583 
Glucuronidase, as reporter gene product, 
742-743 

Glufosinate, plant resistance to, 784-786 
Glutamate, in folate synthesis, 813 
L-Glutamic acid, commercial production of, 
515-519 

Glutamine, substitution of, in streptokinase, 
317-318 

Glutamyl aminocyclopropane-carboxylate 
deaminase, ethylene and, 798-799 
Glutathione peroxide, for oxidative stress, 
793 

Glycans, proteins containing, 176 
Glycated proteins, increasing stability and 
specificity of, 322-323 
Glyceraldehyde phosphate dehydrogenase 
in alcohol production, 592 
from Aspergillus, 259 
in superoxide dismutase production, 250 
Glycine betaine, for osmoprotection, 794 
Glycoproteins 

herpes simplex virus, in vaccines, 463-464, 488 

Pichia pastoris, 253-255 
rabies virus, 488 


Glycosaminoglycans, production of, 545- 
547 

Glycosidases, production of, 394-395 
Glycosyl groups, proteins containing, 176 
Glycosylases, 242 

Glycosylation, of proteins, 241-242 
in Arxula adeninivorans system, 256 
in baculovirus-insect cell expression system, 267, 269-270 
in Pichia pastoris system, 253-254 
for purification, 749-750 
Glycosyltransferases, 242 
Glyphosate, plant resistance to, 784 
Goats, transgenic, milk from, 875, 911 
Gold particles, for gene transfer, 736-738 
Golden rice, 811-812, 925-926 
Golgi apparatus 

glycosylation in, 242 
protein transport in, 44, 253-255 
Gossypol, insect toxicity of, 769 
Grain, see also Corn; Rice; Wheat 

fructose and alcohol production from, 
570-576 

Grain legumes, amino acid modification in, 
803-804 

Gram-negative bacteria 
disruption of, 715 
harvesting of, 711-714 
protein secretion pathways of, 41-42 
restriction nucleases of, 502 
Gram-positive bacteria 
disruption of, 715 
harvesting of, 711-714 
protein secretion pathways of, 41 
Granaticin, production of, 528-529 
Granule-bound starch synthase promoter 
for starch modification, 820 
for vegetable discoloration control, 816 
Granulocyte colony-stimulating factor, production of, 698 

Grass, genetically engineered, 909 
Green beans, in ferritin production, 814 
Green fluorescent protein 
gene of, 284, 742-743 
rhizosecretion of, 749 
Growth 
microbial 

decrease of, in metabolic load, 234 
principles of, 687-692 
plant, bacteria promoting, see Plant 
growth-promoting bacteria 
Growth hormone 
bovine, 905-907 
human 

engineering of, 383-388 
long-lasting, 387-388 
structure of, 383-384 
treatment regimen for, 384-385, 387 
in transgenic fish, 888 
Guanine (G) 

derivatization of, 99, 103 
in DNA structure, 15-17 
Guidelines for Research Involving Recombinant 
DNA Molecules, 898-900 
Gypsy moth, Bacillus thuringiensis toxin 
effects on, 657, 670 

H 

Hac1 transcription factor, 253 
HaeII restriction endonuclease, 55 
Haematococcus pluvialis, beta-carotene keto- 
lase of, 824-825 

Haemophilus parainfluenzae, restriction endonucleases of, 50, 55 
Hairpin DNA, 304 
Hairpin ribozymes, 434 
Hairpin RNA, 441-442, 860-861 
Hammerhead ribozymes, 434-436 
Hansenula polymorpha expression system, 
255 

Haplotypes, for ancestry determination, 
362-363 

HAT procedure, for monoclonal antibodies, 
338-339 

HD gene, mutations of, 867-870 
Heavy chains, of antibodies, 400-401, 404- 
405, 407, 412-413, 417, 862-863 
Heden, Carl Goran, 4 
HeLa cells, 275 

Helicobacter pylori, vector vaccine for, 496- 
497 

Helper plasmids, 733 
Helper virus, for transgenic mice production, 848 

Hemagglutinin, production of, 271 
Hemicellulose 
degradation of, 308-309 
in lignocellulosics, 581-583 
structure of, 583 
Hemoglobin 

bacterial, for oxygen limitation, 220-221, 
629 

defects of, in sickle cell disease, 349, 367- 
368 

in Escherichia coli protein synthesis, 690 
in nitrogen fixation, 621 
production of, 255 
Vitreoscilla, 533, 694, 838 
Hemoglobin A1c, measurement of, 322-323 
Hemophilia, gene therapy for, 446 
Hemorrhagic septicemia virus, vaccines for, 
443-444 

Heparin-binding epidermal growth factor, 
in cell death, 870-871 
Hepatitis B virus, vector vaccine for, 488 
Hepatitis B virus core antigen, in foot-and- 
mouth disease virus vaccine, 471 
Hepatitis C virus 

cirrhosis due to, microarray analysis of, 
158-160 

interferons for, 382-383 
Herbicide-resistant plants, 782-787, 932-935 
Herbicolin, for phytopathogen control, 612 
Herpes simplex virus 
attenuated vaccine for, 485-486 
infections with, ribozymes for, 435 
subunit vaccine for, 463-464 


in transgenic mouse production, 853-854 
vector vaccine for, 488 
Heterochromatin, 40, 282, 284-285 
Heterologous DNA probes, 76 
Heterologous protein production, in 
eukaryotic cells, 240-289 
baculovirus-insect system, 261-271 
fungus-based, 244-261 
mammalian, 271-286 
posttranslational modification of, 240- 
242 

summary, 286-287 
systems for, 242-244 
Heteromeric proteins, 21 
Hevea brasiliensis, mRNA from, 542 
High-capacity vectors, 861-863 
High-copy-number plasmids, 58, 199 
High-fructose corn syrup, 570, 819-820 
HindIII restriction endonuclease, 50, 55, 59 
Hirudin, production of, 251 
Histidine, fusion proteins tagged by, 208 
Histone acetyltransferase, 284 
Histone deacetylase, 284 
Histone proteins, modification of, 282-284 
Hog, see Pig 

Hogness box, in transcription, 38 
Homogenization, for microbial cell disruption, 715-716 

Homologous recombination, in transgenic 
mouse production, 851-855 
Homomeric proteins, 21 
Honeybee population collapse, 934 
Hormone(s) 

for erect-leaf phenotype, 836-837 
juvenile, in baculovirus action, 678-679 
for plant growth, 601, see also Ethylene 
for protein alteration, 38-39 
Housekeeping genes, 37 
Hpa restriction endonucleases, 50 
Human artificial chromosome, in gene therapy, 449-450 

Human embryonic kidney (HEK) cells, 272, 
275, 281 

Human growth hormone, see Growth hormone, human 

Human immunodeficiency virus infection 
α1-antitrypsin for, 394 
cytomegalovirus infections in, 429-430 
nucleic acid delivery in, 455 
therapeutic agents for, 398-399 
Human pancreatic RNase, adding disulfide 
bonds to, 309-310 

Human papillomavirus, vaccine for, 468- 
469 

Huntington disease, mouse model for, 867- 
870 

Hyaluronic acid, commercial production of, 
545-547 

Hybrid antibodies, human-mouse, 403-406 
Hybrid cells, in monoclonal antibody formation, 337-339 
Hybrid proteins, 303-304 
Hybridization 


allele-specific, for cystic fibrosis, 366-367 
DNA, see DNA hybridization 
Hybridoma cells, slow growth of, 411 
Hyc proteins, in hydrogen production, 595- 
596 

Hydrocarbons, microbial degradation of, 
557-559 

Hydrodynamic shearing, for DNA fragmentation, 133-134 
Hydrogen 

metabolism of, 631-632 
production of, 595-596 
Hydrogenase, 630-635 
Hydrogranaticin, production of, 529 
Hydroxamate group, of siderophores, 610 
Hydroxyacetosyringone, in crown gall disease, 727 

Hydroxycinnamyl alcohols, in lignin synthesis, 834-835 

p-Hydroxyphenylpyruvate dioxygenase, in 
vitamin E production, 809-811 
Hygromycin resistance gene, in plant gene 
expression, 744 

Hyperaccumulators, of metals, 839 
Hypermannosylation, in Pichia pastoris system, 253-255 

Hypervariable regions, of antibodies, 320- 
321 

Hypoxanthine, aminopterin, thymidine 
(HAT) procedure, for monoclonal 
antibodies, 338-339 

I 

Ibritumomab, 402 

ICAT (isotope-coded affinity tag) method, 
for proteomics, 171-173 
Ice nucleation proteins, 614, 616-617, 900- 
902 

Identification, of unknown victims, DNA 
fingerprinting for, 354 
Imidazolinones, plant resistance to, 784 
Immunization, see also Vaccine(s) 
in vivo, 877 

Immunoaffinity process, for fusion proteins, 208 

Immunoglobulin(s) 
dual-variable-domain, 420 
Fc region of, in dental caries vaccine, 480 
production of, in plants, 827-829 
Immunoglobulin G, engineering of, 408- 
409 

Immunologic assays 
for diagnosis, 334-336 
for genomic library screening, 76-78 
real-time PCR with, 359-361 
Immunotoxins, 410-411 
Impingement, for microbial cell disruption, 
715-716 

Inclusion bodies 

due to incorrect folding, 217-219 
in fermentation, 696, 698 
prevention of, 219-220 
removal of, 208, 718 
Indigo, commercial production of, 512-514 
Indole, in indigo production, 512-514 
Indole-3-acetic acid 
in crown gall disease, 728-730 
herbicides acting on, 786-787 
in plant growth, 601, 642-643 
Influenza virus 

subunit vaccine for, 270-271 
vector vaccine for, 488 
Information, see Bioinformatics 
Initiation codons, 30 
Initiation complex, 29 
Initiator elements, in transcription, 38-40 
Inositol hexaphosphate (phytic acid), 606- 
608, 814-815, 884-885 
Insect(s) 

in baculovirus-insect cell expression systems, 261-271 

plant resistance to, 759-773, 932, 934-935 
α-amylase inhibitors, 766 
Bacillus thuringiensis protoxin, 760-764, 
770-773 

cholesterol oxidase, 766-767 
protease inhibitors for, 764-766 
RNA interference, 768-769 
vegetative insecticidal toxins, 767 
Insecticide(s) 
chemical 

Bacillus thuringiensis toxin with, 671- 
672, 772 

microbial degradation of, 564-565, 
567-569 

vs. microbial insecticides, 632-633 
microbial, 652-684 
advantages of, 652-653 
Bacillus thuringiensis toxin, 653-677 
baculoviruses, 677-681 
"Insertion-removal" systems, for selectable 
marker genes, 227-228 
Insulin 

changing amino acid composition of, 311 
precursor of, 241 
production of, 245 
recombinant, 5 

Insulin B peptide, production of, 708-709 
Insulin-like growth factor I 
antisense RNA in, 428 
in milk, 906 

purification of, 718-719 
Insulin-like growth factor I receptor, in psoriasis, 432 

Integration enzymes, in baculovirus-insect 
cell expression system, 267 
Inteins, 210 

Intellectual property rights, 912, see also 
Patent(s) 

Interactomes, 185 
Interfering RNAs, 440-442 
Interferon(s) 

cDNA of, isolation of, 380-381 
gene shuffling in, 382 
hybrid, 382-383 
longer-acting, 382-383 


peptide of, 410 
production of, 381-383 
types of, 381-382 

vaccinia virus sensitive to, 490-491 
α-Interferon, production of, 303 
α-Interferon 2b, production of, 698 
β-Interferon, reducing free sulfhydryl residues in, 311-312 

Interleukin-2, production of, 208, 229-230, 
232, 259 

Interleukin-10, production of, 396-397 
Internal ribosomal entry sites (IRES), 274- 
276 

Internal-loop airlift bioreactors, 702, 705 
International Food Biotechnology Council, 
903 

Internet, databases available in, 146-148 
Intestinal tract, transgenes in, 930-931 
Introns, 23-24, 81 
Inventions, patenting of, 911-919 
Irinotecan, 421-422 
Iron, in plants, see also Siderophores 
for growth, 608 
increasing content of, 833-834 
modification of, 813-814 
Isocaudomers, 52 

Isochromanequinones, production of, 528- 
529 

Isochrysis galbana, in long-chain fatty acid 
production, 808 

L-Isoleucine, commercial production of, 
518-519 

Isopenicillin N, production of, 525-526 
Isopentenyltransferase, in cytokinin synthesis, 796 

Isopentyl diphosphate, in lycopene production, 519 

Isopentyl pyrophosphate, in rubber production, 542 

Isopentyladenosine 5'-phosphate, in crown 
gall disease, 728-730 
Isopropanol, production of, 577-578 
Isopropyl-β-D-thiogalactopyranoside 

in baculovirus-insect cell expression system, 267 

in gene expression, 196-198 
in plasmid transformation, 65-66 
in selectable marker gene removal, 227- 
228 

Isoschizomers, 52 

Isotope-coded affinity tag (ICAT) method, 
for proteomics, 171-173 

J 

Jerusalem artichokes, in fructan production, 
818 

Jewish population, haplotypes of, 362-363 
Joint Genome Institute, 136 
Juvenile hormone, in baculovirus action, 
678-679 

K 

Kanamycin resistance gene 


in chloroplasts, 740-741 
in ethylbenzoate degradation, 562 
in Ti plasmid vector, 731 
2-Ketoisovalerate, in valine production, 
518-519 

2-Keto-L-gulonic acid, in ascorbic acid production, 507-512 
Ketosynthetase(s), 531-532 
Killer genes, in Ti plasmid vector, 731 
Kinetics, of enzyme activity, 312-314 
Kits 

for ancestry determination, 363-364 
for cystic fibrosis detection, 366-367 
K3L protein, of vaccinia virus, 490 
Klebsiella oxytoca, in alcohol production, 594 
Klebsiella ozaenae, nitrilase of, 787 
Klebsiella pneumoniae, nif genes of, 622-627 
Klenow fragment, 75, 292 
Knockout mice, 855 
Kozak sequence, 273-274 
Ku protein, in nonhomologous-end-joining 
pathway, 261 

L 

L1 protein, human papillomavirus, 469 
Labeling, of genetically modified foods, 
931-932 

lac genes, of plasmid pUC19, 64-65 
lac promoter, 196-199, 201, 205 
in Fab fragment production, 710-711 
in polyhydroxyalkanoate production, 545 
lac repressor protein, 196 
lacI promoter, 199 

Lactate dehydrogenase, in mammalian cell 
expression, 280-281 
Lactic acid bacteria 

in silage fermentation, 576-577 
therapeutic use of, 395-399 
Lactobacillus, in cyanovirin N production, 
398-399 

Lactobacillus amylovorus, in silage fermentation, 577 

Lactobacillus plantarum, in silage fermentation, 577 
Lactococcus lactis 

in low-ethanol wine production, 574 
promoters for, 204-205 
therapeutic use of, 395-397 
β-Lactoglobulin, in milk, abolition of, 880 
Lactose 

in apolipoprotein B production, 711 
in Fab fragment production, 710-711 
in gene expression, 196-198 
reduction of, in milk, 880 
in xanthan gum production, 538 
lacUV5 promoter, 197 
lacZ gene 

in baculovirus-insect cell expression system, 264, 266-267 
in fusion protein, 207 
Lag phase, of microbial growth, 687-689 
Large-scale production, see also Commercial 
product(s) 
acetate reduction in, 698-701 
of DNA sequencing, 133-142, 685-722 
downstream processing in, 717-719 
efficiency of, 687-692 
fermentation systems for, see 
Fermentation, large-scale 
gene expression for, 201-204 
high-density cell cultures in, 693-694 
mammalian cell expression in, 279-282 
microbial cell disruption in, 714-716 
microbial cell harvesting in, 711-714 
plant systems in, 825-830 
of plasmid DNA, 719 
plasmid stability in, 695-696 
principles of microbial growth for, 687- 
692 

protein secretion in, 696, 698 
quiescent Escherichia coli cells in, 696-697 
summary, 720 

technical challenges in, 685-687 
Laundry 

lipases for, 505-506 
peroxidase for, 325-326 
Lead, phytoremediation of, 839-840 
Leader (signal peptide) sequence 
in protein secretion, 229-230 
Saccharomyces cerevisiae, 246, 250-251 
Leaves, erect, engineering of, 836-837 
Lectins 

protein microarray analysis for, 176 
toxic plants containing, 767-768 
Legal issues, in release of genetically modified organisms, 901-902 
Leghemoglobin, 621, 629 
Leishmania, attenuated vaccine for, 484-485 
Lentiviruses, for transgenic mouse production, 848-850 

Lepidoptera, Bacillus thuringiensis toxin 
effects on, 672-674 
Leptin, production of, 397-398 
Leptosphaeria maculans, RAPD fingerprinting of, 357 

Leptospirillum, genomics of, 149 
Lettuce, in monellin production, 817-818 
LEU2 gene, for superoxide dismutase production, 250 

Leuconostoc lactis, promoters for, 205 
LexA protein, in mammalian cell expression, 284 
Libraries 

antibody, for protein microarray analysis, 
176 

complementary DNA, 210-212 
genomic, see Genomic libraries 
monoclonal antibody, 411-416 
for protein-protein interaction, 185-187 
Ligase chain reaction assay, 371-374 
Ligation 

in cyclic array sequencing, 137 
DNA sequencing by, 131-133 
in oligonucleotide ligation assay, 367-374 
Light chains, of antibodies, 400-401, 404- 
405, 407, 412-413, 417, 862-863 


Lignin 

altering content of, 834-835 
in lignocellulosics, 581-583 
structure of, 582-583 
synthesis of, 834-835 
Lignocellulosic materials, 581-583, 594 
Linearization method, in baculovirus-insect 
cell expression system, 264-265 
Linkers 

for DNA cloning, 104-105 
for fusion protein purification, 209-210 
for random insertion/deletion mutagenesis, 301-303 

Linking, in DNA synthesis, 100 
Linoleic acid, in plants, 805-808 
Linolenic acid, in plants, 805-808 
lip genes, of Pseudomonas alcaligenes, 506 
Lipase 

commercial production of, 505-506 
for phytopathogen control, 614 
Lipid(s) 

for nucleic acid delivery, 451-452 
in plants, modification of, 805-808 
Liquid chromatography, for proteomics, 

169 

Lithium acetate, for DNA transformation, 
244 

Liver damage, mouse model for, 870-871 
Livestock 

cloning by nuclear transfer, 871-873 
transgenic, 871-885 
disease-resistant, 876-879 
donor organ production by, 875-876 
methods for producing, 873 
milk quality improvement in, 879-880 
nutritional content of, 925 
pharmaceutical production in, 873-875 
regulation of, 910-911 
uses of, 873 

Logarithmic phase, of microbial growth, 
687-690 

LongSAGE protocol, 161 
Lovastatin, production of, 529 
Low-copy-number plasmids, 58, 199, 202 
loxP gene 

for selectable marker gene removal, 227- 
228 

for transgenic mouse production, 856- 
858 

Luciferase, 342-343 
in pyrosequencing, 128 
as reporter gene product, 742-743 
Luminescent systems, for diagnosis, 341- 
345 

Luminometer, 344 

Lupine, amino acid modification in, 803- 
804 

lux system, 342-343 

Lycopene, commercial production of, 519 
Lymphoma, monoclonal antibodies for, 402 
Lysine 

replacement of, in streptokinase, 317-318 
in seed amino acid modification, 805 


Lysis, of microbial cells, 714-716 
Lysostaphin, for mastitis prevention, 879 
Lysozyme 

adding disulfide bonds to, 305-306, 308 
for microbial cell disruption, 715 
in phytopathogen control, 791-792 

M 

Macaque model, for neurodegenerative disease, 869-870 

Macular degeneration, pegaptanib for, 439- 
440 

Mad cow disease, prevention of, 877-878 
Maize, see Corn 

Major secretory protein, Mycobacterium 
tuberculosis, 493 
Malaria 

diagnosis of, 347-348 
peptide vaccine for, 472 
MALDI (matrix-assisted laser desorption 
ionization), mass spectrometry with, 
167-169 

Malonyl aminocyclopropane-carboxylate 
deaminase, ethylene and, 798-799 
Maltose, as starch degradation product, 570 
Maltose-binding protein, 229-230 
Maltotriose, as starch degradation product, 
570 

Mammalian cells, expression systems for, 
271-286 

challenges in, 271-272 
importance of, 271-272 
plasmid integration in, 282-286 
productivity enhancement in, 279-282 
vectors for, 272-279 

MammaPrint microarray analysis system, 
160 

Mammary glands, see also Milk 
infection of, 878-879, 906 
pharmaceutical production in, 873-875 
Mannheimia succiniciproducens, in succinic
acid production, 520-521
Mannose 

fermentation of, acetate accumulation 
and, 698 

in Pichia pastoris system, 253-255 
Mapping 

of protein-protein interaction, 181, 183- 
188 

of restriction endonucleases, 52-56 
Marker genes 

in chloroplasts, 740, 753-755 
in mammalian expression, 278 
plants without, 750-751 
removal of, 227-228, 750-755 
in Saccharomyces cerevisiae systems, 246- 
248 

in Ti plasmid, 731-733 
in transformed plant cells, 741-743 
Marker peptides, 208 

Mass spectrometry, for proteomics, 167-169 
Mastitis, in cattle, 878-879, 906 
Maternal DNA, 361 
Mating factor α, 246, 250
Matrix-assisted laser desorption ionization 
(MALDI), mass spectrometry with, 
167-169 

Matrix-associated regions, 284-286 
Measles, α1-antitrypsin for, 394
Meat, from genetically modified cattle, 
nutritional content of, 925 
Medaka (Japanese killifish)
enteropeptidase of, increasing stability and
specificity of, 323-324 
for environmental pollution monitoring, 
888-890 

Medermycin, production of, 528-529 
Mederrhodine A, production of, 529 
Melanin, commercial production of, 538- 
539 

Membrane filtration, for microbial cell harvesting, 712-714

Mercury, phytoremediation of, 840-841 
Merozoites, in malaria, 472 
Messenger RNA (mRNA), 22 
aberrant splicing of, 432-433
antisense, 426-434
cellulase genes of, 586 
cloning of, 81-86 

codons of, mutagenesis of, 292-295 
in DNA microarray analysis, 155 
in flower pigmentation manipulation, 822

formation of, 24-35 
interferon cDNA from, 380-381 
patenting of, 916 
PCR monitoring of, 359 
ribosome-binding site in, 212-213 
structure of, 15 

targeted alterations to, 747-748 
transcription of, see Transcription 
translation of, in eukaryotes, 240-242 
tRNA interactions with, 27-29 
Metabolic load/drain/burden, 233-235 
causes of, 233 
consequences of, 234 
minimization of, 234-235 
of plasmids, 222-223 
Metabolomics, in amino acid production, 
517-518 

meta-cleavage pathway, for 4-ethylbenzoate 
degradation, 559-562 
Metagenomics, 148-154 
bioinformatic systems for, 153-154 
clone identification in, 151-153 
definition of, 148 
effective use of, 149 
libraries in, 79, 148-150
objectives of, 148-149 
robotic systems in, 150-151 
summary, 189 

Metal(s), plant uptake of, 646-647 
Metal cofactors, of proteins, modifying 
requirements for, 316 
Metallothionein, in ferritin production, 814


Methanococcus jannaschii, in directed mutagenesis, 304-305

Methanol, for Pichia pastoris system, 253- 
255 

Methanol oxidase promoter, 255 

Methionine 

lupine deficient in, 803-804 
in mugineic acid production, 834 

Methotrexate, in dihydrofolate reductase- 
methotrexate system, 278-279 

Methyl α-glucosidase, in fermentation,
698-699 

Methyl mercury, phytoremediation of, 840- 
841 

2'-O-Methyl phosphorothioate linkage, in
antisense oligonucleotides, 433 

Methylation, of DNA, 51-52, 502, 504 

2-C-Methyl-D-erythritol 4-phosphate pathway, in lycopene production, 519

O-Methyl-L-tyrosine-tRNA synthetase, 304 

4-Methyl-5-nitrocatechol, microbial degradation of, 567-569

3-Methyl-4-nitrophenol, microbial degeneration of, 567-569

Mice 

for human antibody production, 407-408
for hybrid human-mouse antibody production, 403-406

mouse double-mutant 2 protein from, 279

mouse myeloma cells from, 272 
transgenic, 847-871 
alphaherpesvirus model for, 866 
Alzheimer disease model for, 863-865 
applications of, 863-871 
conditional control of cell death in, 
870-871 

conditional regulation of transgene 
expression in, 866-870 
Cre-loxP recombination system for, 856-858
cystic fibrosis transmembrane regulator protein production in, 873-874
DiGeorge syndrome model for, 858 
diphtheria toxin model for, 870-871 
DNA microinjection method for, 850- 
851, 862 

engineered embryonic stem cell method for, 851-855

high-capacity vectors for, 861-863 
human antibody production in, 862- 
863 

Huntington disease model for, 867-870 
identification of, 850-851 
knockout, 855 

in lysostaphin production, 879 
methodology for, 847-863 
in myostatin alterations, 881-883 
in omega-3 fatty acid production, 884 
patents for, 917-918 
pseudorabies virus model for, 865-866 
retinitis pigmentosa model for, 855, 857 
retroviral method for, 848-850 


RNA interference for, 858-861 
as test systems, 865-866 
tetracycline regulatory system in, 866- 
870 

thymidine kinase production in, 853- 
854, 862 

XenoMouse, 863 

Michaelis constant, alterations of, 290-291 
Microarray analysis 

DNA, see DNA microarray analysis 
protein, 172, 174-181
Microbial biosensors, 343-345 
Microbial cell(s) 
disrupting, 714-716
harvesting of, 711-714 
Microfluidizer, for microbial cell disruption, 716

Microinjection, of DNA, for transgenic 
mouse production, 850-851, 862 
Microparticles, in DNA vaccine production,
475-476

Microprojectile bombardment, for gene 
transfer, 736-738 

Micro-RNAs, for virus resistance, 782 
Microscope, confocal scanning, in DNA 
microarray analysis, 156-157 
MIDGE (minimalistic immunogenically defined gene expression) vectors, in
DNA vaccine production, 478-479

Milk 

bovine growth hormone effects on, 905- 
907 

from genetically modified corn-fed cows, 
nutritional content of, 925 
from transgenic animals 
antibodies in, 877-879 
cystic fibrosis transmembrane regulator protein in, 873-874
drugs produced in, 846, 911 
improving quality, 879-880 
lactose content decrease in, 880 
nutritional content of, 925 
pharmaceuticals in, 873-875 
Milling, for microbial cell disruption, 715- 
716 

Minimal polyketide synthase, 531-532 
Minisatellite DNA, 354-355 
Mismatch, in oligonucleotides, 292, 295 
Mitochondria, genetic transformation of, 
738-741 

Mitochondrial DNA, in ancestry determination, 361-362
Mixing 

in bioreactors, 701-705 
in fermentation, 693 
Molecular beacons, 352-353 
Molecular biotechnology 
benefits of, 10-11 
commercialization of, 6, 8-10 
emergence of, 3-5 
history of, 5-7 
process stages in, 4 
social concerns about, 11-12 
summary, 12

Molecular breeding, 303-304, 325 
Molecular cloning, see Recombinant DNA 
technology 

Molecular databases, 146-148 
Molecular diagnostics, 333-378 
biofluorescent and bioluminescent systems in, 341-345
economic issues in, 333-334 
of genetic disease, 365-375 
ideal qualities of, 333 
immunological, 334-336 
monoclonal antibodies, see Monoclonal 
antibodies 

nucleic acid, 341-345, see also DNA 
hybridization; Polymerase chain 
reaction (PCR) 

DNA fingerprinting in, 353-355 
summary, 375-376 

Monarch butterfly, Bacillus thuringiensis
toxin effects on, 933-934 
Monellin, for sweetness improvement, 817- 
818 

Monoclonal antibodies 
anticancer, 421-422
chemically linked, 417, 419-420
chimeric, 403-406
for diagnosis, 337-341
dual-variable-domain, 420
in ELISA, 335-336
formation of, 337-339
fragments of, 408-415
full-length, libraries for, 415-416
functions of, 400-402
genes of, 443-444
humanized, 403-408
hybrid cell line identification in, 339-341
hybrid human-mouse, 403-406
libraries of, combinatorial, 411-416
for nucleic acid delivery, 454-455
production of

in transgenic chickens, 886
in transgenic mice, 862-863
selection of, 337-339
shuffling CDR sequences in, 416-418
structures of, 400-402
therapeutic, 399-422
for transplant rejection, 402-403
Monocotyledonous plants, gene transfer in, 
734-735 

Monooxygenase tyrosinase, in melanin production, 538-539

Mortierella alpina, in long-chain fatty acid 
production, 808 

Mosquito larvae, Bacillus thuringiensis toxin
effects on, 666-668, 676 
Mouse, see Mice 

Mouse double-mutant 2 protein, 279 
Mouse myeloma cells, 272 
mRNA, see Messenger RNA (mRNA) 

MspI restriction endonuclease, 52 
MSTN gene, engineering of, for improving 
muscle mass, 880-883 


Mucosal immunity, 476 
Mucosal vaccines, 476, 491, 495-496
Mugineic acid, in siderophores, 834 
Multicopy episomal DNA, 281 
Multidomain fusion proteins, 480 
Multiple cloning sites, 64-65 
Multiprotein complexes, in baculovirus- 
insect cell expression system, 270- 
271 

Municipal waste, lignocellulosics from, 
581-583 

Muscle mass, improvement of, in transgenic animals, 880-883

Muscular dystrophy, gene therapy for, 450- 
451 

Mutagenesis 

directed, see Directed mutagenesis 
induced, 4 

random, 298-303, 321 
myc gene, patent for, 917 
Mycobacterium bovis vaccine, 493-494 
Mycobacterium tuberculosis, vector vaccine 
for, 492-494 

Mycolytransferase, Mycobacterium tuberculosis, 493-494

Mycotoxins, economic impact of, 936 
Myeloma cells, for monoclonal antibody 
formation, 337-339 
Myostatin 

defects of, gene therapy for, 450-451
manipulation of, for improving muscle 
mass, 880-883 

Mytilus edulis, byssal adhesive of, 539-541 
Mytilus galloprovincialis, adhesive of, 541 

N 

naat genes, in iron production, 834 
NADH, in ascorbic acid production, 511- 
512 

NAH (naphthalene-degrading) plasmid, 558

Naked DNA, in gene therapy, 449 
Nanopore sequencing, 142 
Naphthalene-degrading plasmid, 558 
National Human Genome Research 
Institute, 133 

National Institutes of Health, regulations 
of, 898-900 

Nebulization, for DNA fragmentation, 133 
Nectin-1 receptor, in pseudorabies virus 
protection, 866 

Neomycin phosphotransferase, 277, 278 
as reporter, 744 
in Ti plasmid vector, 731 
in vector vaccine production, 488 
Neomycin resistance gene, 204 
Neonatal scours, resistance to, 877 
Neoschizomers, 52 

Neurofibrillary tangles, in Alzheimer disease, 863-865
Neuron(s), genes in, 39 
Neuron-restrictive silencer element, 39-40 
Neuron-restrictive silencer factor, 39-40 


Neurotoxins, in baculovirus production, 
680-681 

Neutralizing antigens, in vaccine production, 494-495

Nick(s), in DNA, sealing of, 60 
Nickel, plant uptake of, 647 
Nicotiana tabacum, chimeric gene expression 
in, 762 

Nicotinamide adenine dinucleotide, in low- 
ethanol wine production, 574-575 
Nicotine synthase, 838 
nif genes, in nitrogen fixation, 622-630 
NitR protein, in antibiotic production, 527 
Nitrilase 

in antibiotic production, 527 
in herbicide-resistant plants, 787 
Nitroaromatic compounds, microbial 
degeneration of, 567-569 
4-Nitrocatechol, microbial degradation of, 
567-569 

Nitrogen fixation, 619-630, see also 
Nodulation 

bacteria involved in, 619-621 
energy requirements of, 619, 630 
hydrogen gas production in, 630-635 
impact on growth, 603 
in metabolic load, 234 
nitrogenase in, 621-628 
oxygen levels in, 628-629 
poly-β-hydroxybutyrate modulation in,
629-630 
summary, 648 
Nitrogenase, 621-628 
components of, 621-622 
genetic engineering of, 622-627 
hydrogen gas production by, 630-635 
oxygen level engineering for, 628-629 
4-Nitrophenol, microbial degeneration of, 
567-569 

NlaIII restriction endonuclease, in SAGE
method, 160-161 
nod genes, 635-640 
Nodulation, 635-640 
ethylene and, 640 
genetic engineering of, 635-639 
organism competition in, 635 
in Rhizobium meliloti, 617 
Noncoding strand, 213 
Nonhomologous random recombination, 
304 

Nonhomologous-end-joining pathway, 261 
Nopaline, in crown gall disease, 729-730 
Normalization, in DNA microarray analysis, 158

Nostoc ellipsosporum, cyanovirin of, 398-399 
NPR1 gene, in phytopathogen control, 789 
NSFnet, 146 

Nuclear antigen 1, in mammalian cell 
expression, 281 

Nuclear power plant wastes, microbial degradation of, 565-567

Nuclear transfer, for livestock cloning, 871- 
873 
Nucleic acids, as therapeutic agents, 426-458, see also DNA; RNA
antibody genes, 443-444
antisense RNA, 426-434
aptamers, 437-440
chimeric RNA-DNA molecules, 437
delivery of, 444-456
interfering RNAs, 440-442
ribozymes, 434-437
summary, 456 

Nucleocapsids, of baculovirus, 678 
Nucleoproteins, in DNA vaccine production, 474-475

Nucleotides, in DNA structure, 14-15, 19
Nutritional content 

of food, modification of, 924-926 
of plants, modification of, 803-815 
amino acids, 803-805 
iron, 813-814 
lipids, 805-808 
phosphorus, 814-815
vitamins, 808-813 

O

Occlusion body, of baculovirus, 678 
OCT (octane-degrading) plasmid, 558 
Octane-degrading plasmid, 558 
Octopine, in crown gall disease, 729-730, 
739 

Oils 

edible, modification of, 805-808 
petroleum, microbial degradation of, 
557-559 

OKT3 antibody, 402-403 
OLA (oligonucleotide ligation assay), 367- 
374 

Oleic acid, in plants, 805-808 
Oleosins, 748-749 

Oligodeoxynucleotides (deoxyribozymes), 
436-437, 525-526 

(2'-5')-Oligoisoadenylate synthetase, in
interferon production, 382 
Oligonucleotide(s) 
addition to cDNA, 113 
antisense, therapeutic, 428-434
as aptamers, 437-440 
chemically synthesized 

automatic production of, 98-99 
phosphoramidite method for, 99-103 
single-stranded, 103-104 
uses of, 103-108 
chimeric, 437 
complementary, 105 
in DNA microarray analysis, 155-157 
in DNA sequencing, 131-133 
in fusion protein purification, 209-210 
in gene synthesis, 113-117 
overlapping, 106-108 
in padlock probes, 374 
Oligonucleotide ligation assay, 367-374 
Oligonucleotide primers, degenerate, random mutagenesis with, 298-300
Oligonucleotide-directed mutagenesis 


for adding disulfide bonds, 305-306, 308 
for enzyme activity engineering, 312-314 
with M13 DNA, 292-295 
for modifying metal cofactor requirements, 316

PCR-amplified, 297-298 
with plasmid DNA, 295-297 
Omega-3 fatty acids 
in meat animals, 883-884 
in plants, 805-808 

Omega-6 fatty acids, in meat animals, 883- 
884 

ompF gene, in fusion protein, 207 
OncoMouse, patents for, 918 
Oomycin, for phytopathogen control, 612 
Open reading frames (ORFs), in protein 
microarray analysis, 176-181 
Operator region, 33-35 
Operons, 33, 35-36, 153
Opines, in crown gall disease, 728-730, 739

ORFeomes, 176-177 
ORFs (open reading frames), in protein 
microarray analysis, 176-181 
Organ(s), for transplantation, from transgenic animals, 875-876
Organic compounds, phytoremediation of, 
841 

Organisation for Economic Co-operation 
and Development, 908 
Organophosphate, microbial degradation 
of, 564-565 

Organophosphate lyase, 564-565 
Origin of replication, 20 
Ornithine transcarbamylase deficiency, 
gene therapy for, 446 

ortho-cleavage pathway, for biodegradation, 
555 

OsDWARF4 genes, rice erect-leaf phenotype 
and, 837 

Osmoprotectants, for plants, 793-796 
Osmotic shock, for microbial cell disruption, 715

Outer membrane protein A, in fusion proteins, 211, 230-232

Outer membrane protein F, in fusion proteins, 211

Ovalbumin, from transgenic chickens, 
pharmaceuticals in, 886 
Ovarian cancer, protein microarray analysis 
in, 176 

Oxidation, in DNA synthesis, 102 
Oxidative stress, plants resistant to, 792-793 
Oxygen 

in fermentation, 692-694 
limitation of, overcoming, 220-221 
for nitrogen fixation, 628-629 
within plant, 837-838 

P 

p53 tumor suppressor protein, 279 

Padlock probes, 374 

PaeR7I restriction endonuclease, 52 


Palm oil, modification of, 805-808 
Palmitic acid, in plants, 805-808 
Pancreatic RNase, human, adding disulfide 
bonds to, 309-310 
Panitumumab, production of, 863 
Pantoea agglomerans, in lycopene production, 519

Papaya, genetically modified, regulation of, 
907 

Papaya ringspot virus, 779 
Parasitic infections, diagnosis of, 334, 347- 
350 

Parasporal crystal, of Bacillus thuringiensis, 
654-658, 660-661, 667 
Parathion, microbial degeneration of, 567- 
569 

Partial digestion, Sau3AI restriction endonuclease in, 59

Patatin type I promoter, for vegetable discoloration control, 816
Patent(s), 911-919 
categories of, 913-914 
in different countries, 912-913, 915 
for DNA sequences, 916-917 
fundamental research and, 918-919 
importance of, 912 
for multicellular organisms, 917-918 
process for, 912-913 
rejection of, 913 
requirements for, 912-913 
for "superbug," 557-558, 914 
for transgenic animals, 917-918 
Paternal DNA, 361 
Paternity determination 
DNA fingerprinting for, 354 
DNA testing for, 361-364 
Pathogen(s), plant, see Phytopathogen(s) 
Pathogenesis-related protein, of Bacillus 
thuringiensis, 770-773 
Pathogen-related proteins, 788-789 
pBR322 plasmid cloning vector, 59-60, 62
pCP3 plasmid, 201, 202
PCR, see Polymerase chain reaction (PCR) 
Pea, insect resistance in, 766 
Pegaptanib, therapeutic, 439-440 
PEGylation, of interferons, 383 
Penicillin(s), production of, 259, 526 
Penicillin N, production of, 534 
Penicillium, expression systems for, 259 
Penicillium chrysogenum, in antibiotic production, 525-526
Peptide(s) 

identification of, mass spectrometry for, 
167-169 
marker, 208 
signal, 229-230 

Peptide mass fingerprinting, 167 
Peptide vaccine(s), 469-472, 477

foot-and-mouth disease virus, 470-471, 477

malaria, 472 

Peptide-glycan-associated lipoprotein, in 
fusion proteins, 211 





Periplasm 

protein secretion into, 229-230 
protein transport to, 43-44
Periwinkle, insect toxicity of, 768 
Peroxidase, modifying multiple properties 
of, 325-326 

PEST sequences, protein stability and, 216 
Pesticides, see also Insecticide(s) 

disposal of, see Bioremediation and biomass utilization; Phytoremediation
herbicide-resistant plants and, 782-787, 
932-935 

microbial degradation of, 564-565
pET vectors, 198 

Petroleum, microbial degradation of, 557- 
559 

Petunia, genetically modified, 824, 907 
pH, for fermentation, 693 
Phagemids, 211 
Pharmaceutical(s), 380-388 
growth hormone, 383-388 
human interferons, 381-383 
interferon cDNA for, 380-381 
production of 

economic impact of, 935-936 
in transgenic livestock animals, 873- 
875 

in transgenic plants, 909-910 
in transgenic poultry, 886 
safety concerns about, 900 
tumor necrosis factor alpha, 388 
"Pharming," 846 

Phenazine-1-carboxylic acid, for phytopathogen control, 612-613
Phenoxycarboxylic acids, plant resistance 
to, 784 

Phenylalanine, in anthocyanin synthesis, 
822-823 

Phenylalanine ammonia lyase, production 
of, 392-393 

Phenylketonuria, 392-393 
Phenylpropane units, in lignin, 582-583 
Phenylpropanoidbenzenoid compounds, in 
flower pigmentation, 824 
Phosphatase, Pichia pastoris, 255 
Phosphate groups 
in DNA structure, 14-15, 19
in RNA structure, 15 
Phosphinothricin, plant resistance to, 784 
Phosphite triester, oxidation of, 102 
Phosphodiester linkage 

in antisense oligonucleotides, 428-431
in pyrosequencing, 125-128 
Phosphoenolpyruvate carboxylase, inhibitors of, 432
Phospholipase A, 231 
Phosphoramidite linkage, in antisense oligonucleotides, 430-431
Phosphoramidite method, for DNA synthesis, 99-103

Phosphorothioate linkage, in antisense oligonucleotides, 428-430
Phosphorus 


availability of, bacteria promoting, 606- 
608 

as fermentation inhibitor, 694 
in plants, modification of, 814-815 
uptake of, 603 

Phosphorylation, in DNA synthesis, 103 
Phosphotyrosine, proteins containing, 176 
Photorhabdus luminescens, insect toxicity of, 
768 

Phytase, 606-608 
in ferritin production, 814 
production of, 255 
Phytate/phytic acid 
breakdown of, 606-608 
in phosphate utilization, 884-885
plants enriched with, 814-815
Phytoene synthase and phytoene desatu- 
rase, in vitamin A synthesis, 811-812 
Phytoextraction, 839-841 
Phytohormones, 601, see also Ethylene 
for erect-leaf phenotype, 836-837 
Phytopathogen(s) 
bacterial, 787-793 
control of, 608-619 
antibiotics for, 612-614 
enzymes for, 614 
ethylene for, 617-618 
ice nucleation and antifreeze proteins 
for, 614, 616-617, 900-902 
root colonization in, 618-619 
siderophores for, 608-612 
fungal, see Fungi 
viral, 773-782 
Phytoremediation 
definition of, 838-839 
genetic engineering for, 641-647, 838-841 
bacterial endophytes in, 644-646
degradative plasmids in, 643-644 
for growth facilitation, 641-643 
for metal removal, 646-647 
Phytostabilization, 839-841 
Phytostimulation, 841 
Phytotransformation, 841 
Phytovolatilization, 839-841 
Pichia pastoris, expression systems of, 253- 
255 
Pig 

alphaherpesvirus infections in, mouse 
model for, 866 

pseudorabies virus infections in, 865-866 
transgenic 

donor organs from, 875-876 
omega-3 fatty acid synthesis in, 884 
phosphorus excretion from, 884-885 
phytase gene in, 885 

Pigmentation, of flowers, manipulation of, 
821-825 

pIII protein, of bacteriophage M13, 211
Pilus gene, deletion of, 222 
Pilus protein, 211 

Pink bollworm, toxin resistance in, 772-773 
Pink stem borer, in rice, 765 
pKK233-2 expression vector, 214 


pL promoter, 196, 201
Plant(s) 

amino acid content of, 803-805 
antibody production in, 827-830 
appearance improvement in, 815-821 
as bioreactors, 825-830 
chloroplast engineering in, 738-741 
cotransformation of, 752-753 
cultivars of, RAPD markers of, 355-357 
discoloration of, 815-817 
dwarf and semidwarf varieties of, 836- 
837 

edible vaccines from, 830-832 
erect leaves in, 836-837 
fermentation of, 576-577 
flowers of 

pigmentation alterations in, 821-825 
wilting regulation in, 796-799 
gene transfer to, 735-738 
genetic engineering of 
methodology of, 725-758 
for quality improvement, 803-844 
reasons for, 725-726 
for stress resistance, 759-802 
summary, 755-756 
targeted, 745-747 

growth of, bacteria promoting, 599-651 
hormones of, 601 
hyperaccumulating, 839 
iron content of, 813-814, 833-834 
lignin content of, 834-836 
lignocellulosics in, 581-583, 594 
lipid alteration in, 805-808 
manipulation of gene expression in, 743- 
750 

gene targeting for, 745-747 
promoters for, 743-745 
protein purification for, 748-750 
RNA alterations in, 747-748 
marker genes in, 750-751 
marker-free, 750-755 
nutritional content modification in, 803- 
815 

oxygen content of, 837-838 
phosphorus content of, 814-815
in phytoremediation, 641-647, 838-841 
pigmentation manipulation in, 821-825 
polymer production in, 830 
for remediation, 641-647 
reporter genes in, 741-743 
resistance in 
bacteria, 787-792 
drought, 793-796 
fungi, 787-792 
herbicide, 782-787, 932-935 
oxidative stress, 792-793 
salt, 793-796 
virus, 773-782 
salt-resistant, 793-796 
starch content of, 818-821 
stress in, 604-605 
sweetness of, 817-818 
taste modification in, 815-821 

Ti plasmid transformation of, 726-735 
transgenic, see Plant(s), genetic engineering of

vitamin content of, 808-813 
yield optimization in, 832-838 
Plant growth-promoting bacteria, 599-651 
antibiotics of, 612-614 
antifreeze proteins of, 614, 616-617, 900- 
902 

enzymes of, 615 
ethylene of, 617-618 
free-living, 600-608 
hydrogenase of, 630-635 
ice nucleation proteins of, 614, 616-617, 
900-902 

in nitrogen fixation, 619-630 
in nodulation, 635-640 
in pathogen control, 608-619 
in phosphorus availability, 606-608 
in phytoremediation, 641-647 
research on, 600 
root colonization of, 618-619 
siderophores of, 608-612 
in stress reduction, 604-605 
summary, 648 
Plant oil bodies, 748-749 
Plantibodies, 827-829 
Plasmid(s), 57-58 

in attenuated vaccine production, 481- 
482 

bacillus Calmette-Guerin, 493-494
CAM (camphor-degrading), 558 
cloning vectors for, 57-68 
pBR322, 59-60, 62 
pUC19, 64-65 
requirements for, 66-67 
selection of, 60-63 
shuttle, 67-68 
transformation of, 60-63 
cryptic, 58 
definition of, 57-59 
degradative, 58, 643-644 
DNA, large-scale production of, 719 
DNA loss from, in metabolic load, 234 
in DNA vaccine production, 474-478
exchanged entry (by-product), 180 
F, 58 

helper, 733 

high-copy-number, 58 
host range of, 59 
incompatibility of, 58 
instability of, 196 

integration of, into mammalian cells, 
282-286 

low-copy-number, 58 
metabolic load of, 222-223 
multiple compatible degradative energy-generating, 571
NAH7, 512-514 

NAH (naphthalene-degrading), 558 
natural vs. genetically engineered, 59 
OCT (octane-degrading), 558 


oligonucleotide-directed mutagenesis 
with, 295-297 
pBR322, 512 
pCB103, 734 
pCB301, 733 
pCP3, 196, 201, 202
pWWO, 559-562 
R (resistance), 58 
SAL (salicylate-degrading), 558 
size of, 58 

stability of, in fermentation, 695-696 
Ti, see Ti (tumor-inducing) plasmid 
TOL (toluene-degrading), 558 
transfer of, in bacteria, 93-94, 557-559 
2μm, 244-245, 250
types of, 58 

in vector vaccine production, 486-487 
XYL (xylene-degrading), 558 
Plasmin, streptokinase digestion by, 317-318 
Plasmodium, vaccine for, 472 
Plasmodium falciparum 
detection of, 347-348 
protein of, 212 

Plastics, biodegradable (polyhydroxyal- 
kanoates), commercial production 
of, 542-545, 711, 830 
Plastids, 741 

Pneumatic bioreactors, 701-705 
Pokeweed antiviral protein, 780-781 
Poliovirus, as vaccine vector, 491 
Pollen, lack of chloroplast DNA in, 740 
Pollination, gene transfer in, 933 
Pollutants, environmental 
biosensors for, 343-345 
dealing with, see Bioremediation and biomass utilization; Phytoremediation
monitoring in fish, 888-890 
Polony (polymerase colony), in cyclic array 
sequencing, 137 
Poly(A) tail, 27, 81-86 

Polyacrylamide gel electrophoresis, for pro- 
teomics, 166-169 

Polyamide linkage, in antisense oligonucleotides, 430-431

Polyclonal antibodies, 336-337 
Poly-β-hydroxybutyrate, in nitrogen fixation, 629-630

Poly(3-hydroxybutyrate-co-3-hydroxyvalerate) copolymer, production of, 542-545

Poly(3-hydroxybutyric acid), production of, 
542-545, 830 

Polycyclic aromatic hydrocarbons, phytoremediation of, 642

Polydeoxycytidylic acid, in cloning DNA 
sequences, 84-85 

Polydeoxyguanylic acid, in cloning DNA 
sequences, 84 

Polydeoxythymidylic acid, in cloning DNA 
sequences, 84 
Polyethylene glycol 
interferon attachment to, 383 
for monoclonal antibody preparation, 338 


Polygalacturonase, in fruit ripening, 797 
Polyglutamine repeats, in Huntington disease, 868-870

Polyhedrin gene, of baculovirus, 262-263 
Polyhedron structure, of baculovirus, 261, 
678 

Polyhydroxyalkanoates, commercial production of, 542-545, 830
cis-1,4-Polyisoprene (rubber), commercial
production of, 541-542
Polyketide antibiotics, 529-532 
Polyketide synthases, 530-532 
Polylinker, in Ti plasmid vector, 731 
Poly-L-lysine, in gene therapy, 449 
Polymer(s), see also Biopolymer(s) 
production of, in plants, 830 
Polymerase chain reaction (PCR), 108-117 
cycles of, 109-113 

in cyclic array sequencing, 136-139 
in cystic fibrosis detection, 367 
in DNA microarray analysis, 155 
in DNA sequencing, 142 
in DNA shuffling, 303-304 
emulsion, 142 

essential components of, 108-109 
in full-length cDNA amplification, 113 
in gene synthesis, 113-117 
in genotyping, 374 
in mutagenesis 
error-prone protocol, 298 
oligonucleotide-directed, 297-298 
oligonucleotide ligation assay with, 367- 
374 

patents for, 912 
real-time, 358-361 
reverse transcription, 359 
templates for, 111-113 
in transgenic mouse production, 854-855 
for Trypanosoma cruzi, 349-350 
Polynucleotide kinase, for DNA sequencing, 134
Polypeptides 
precursor, 241 
as protein subunits, 21 
Polyphenol oxidases, in food discoloration, 
816 

Population studies, DNA testing for, 361- 
364 

Positive-negative selection, in transgenic 
mouse production, 852-853 
Posttranslational modification, 29, 240-242 
Potato 

chitinase gene in, 790 
discoloration control in, 816 
edible vaccine antigens in, 831-832 
genetically modified 

nutritional content of, 927 
regulation of, 909 
lysozyme gene in, 791-792 
pokeweed antiviral protein in, 780-781 
polyphenol oxidase of, 816 
starch of, modification of, 818-821 
toxins in, 927 
Potato beetle, Bacillus thuringiensis toxin
effects on, 676-677 

Potato proteinase inhibitor II gene, for 
insect resistance, 765 

Potato virus X, in antibody production, 829 
Poultry, transgenic, 884-886 
Power plant wastes, microbial degradation 
of, 565-567 
p E promoter, 201 

PR-1a gene, of Bacillus thuringiensis, 770-773

Prepro-α-factor sequence, 251
Preproinsulin, 241 

Presqualene diphosphate, in antibiotic production, 535

"Prey," in protein-protein interaction, 182- 
186 

Pribnow box, 33 
Primary antibodies, 339 
Primates, transgenic, neurodegenerative 
disease models of, 869-870 
Primer(s) 
anchor, 132 

in cyclic array sequencing, 137 
in ligation sequencing, 132 
PCR 

for genotyping, 374 
in oligonucleotide-directed mutagenesis, 297-298
uses of, 104 

Primer-adaptors, in cloning DNA 
sequences, 84 

Primer-walking strategy, 124-125 
Prion diseases, prevention of, 877-878 
Probes, see DNA probe(s) 

Probiotics, lactic acid bacteria as, 395-397 
Projectile bombardment, for gene transfer, 
736-738 

Prokaryotes, see also Bacteria 
cellulases of, 583-586 
gene expression in, 195-239 
genomic library for, 68-70 
protein folding in, 241 
protein secretion pathways of, 40-43
transcription in, 23-24, 26-27 
transformation of, 92-94 
translation in, 26-29 

Prolactin receptor, growth hormone binding to, 386
Promoter(s) 

for Arxula adeninivorans, 257-258 
for filamentous fungi, 259-261 
lac, 196-199, 201, 205 
for Lactococcus lactis, 204-205 
for large-scale systems, 201-204 
leaky, 199 

for Leuconostoc lactis, 205 

for mammalian cell expression, 273 

pL, 202

for plant genetic transformation, 743-745 
for Saccharomyces cerevisiae, 244-246
stationary-phase, 203 
strong regulatable, 196-205 


tac, 196-198, 204 
in transcription, 23, 38 
trp, 196-198

Promoter-tagging vectors, 743 

Pronucleus, in transgenic mouse production, 850-851

Propionate, in antibiotic production, 529- 
532 

Prostate carcinoma, antisense RNA agents 
for, 428 

Protamine, 454-455

Protease(s) 

as fermentation by-product, 689-690 
host strains deficient in, for oxygen limitation, 220
inhibitors of 

for phytopathogen control, 788 
for plant insect resistance, 764-766 
for phytopathogen control, 614 
sensitivity of, decreasing, 317-318 

Protein(s), see also specific proteins 

activity of, for genomic library screening, 
78-80 

chimeric, 303-304, 325 
cleavage of, 29 
definition of, 20 
degradation of, 215-216 
engineering of, see Protein engineering 
of eukaryotes, cloning DNA sequences 
for, 80-86 

fluorescent, 341-345 
folding of, 240-242 
facilitation of, 217-219 
in Saccharomyces cerevisiae systems, 
246-248 
functional, 21 
functions of, 20-21 
fusion, see Fusion protein(s) 
gene expression in, profiling of, 169-172 
half-lives of, 215-216 
heteromeric, 21 
homomeric, 21 
hybrid, 303-304 
identification of, 165-169 
insoluble, recovery of, 718-719 
interaction among, mapping of, 181, 183-188

luminescent, 342-343 
microarray analysis of, 172, 174-181 
production of, gene expression regulation for, 201

purification of, 717-719, 748-750 
secretion of 

in fermentation, 696, 698 
increasing, 228-232 
in mammalian cell expression, 282 
into medium, 230-231 
pathways for, 40-44
into periplasm, 229-230 
from Saccharomyces cerevisiae, 250-253 
yields of, 228 
separation of, 165-169 
solubilization of, 718-719 


specificity of, modifying, 318-321 
stability of 

increasing, 215-219 
intrinsic, 215-216 
structures of, 20-21 
study of, see Proteomics
suicide, 66 

sweet-tasting, 817-818 
synthesis of, see also Transcription 
summary, 45 

translation of, see Translation 
Protein C, from milk of transgenic animals, 
875 

Protein disulfide isomerase, 219, 282 
Protein engineering, 305-326 
adding disulfide bonds, 305-310 
altering multiple properties, 325-326 
changing asparagine to other amino 
acids, 310-311 

decreasing protease sensitivity, 317-318 
increasing enzymatic activity, 312-315 
increasing enzyme stability and specifici¬ 
ty, 321-324 

modifying metal cofactor requirements, 
316 

modifying protein specificity, 318-321 
reducing free sulfhydryl residues, 311-
312 

summary, 327 

Protein microarray analysis 
analytical (capture), 174-176 
antibodies in, 174-176
functional, 176-181 
methods for, 172, 174 
purpose of, 172 
reverse-phase, 176 

α1-Proteinase inhibitor, production of, 259-
260 

Proteomics, 164-188
complicating factors in, 165 
definition of, 147 
importance of, 164-165
microarray analysis in, 172-181 
profiling expression in, 169-172 
protein-protein interaction mapping in, 
181-188 

separation and identification in, 165-169 
shotgun, 169 
summary, 189 

Protocatechuate, as biodegradation prod¬ 
uct, 552-555 

Protonation, of amino acids, 166 
Protoplasts, gene transfer to, 735-736 
Providencia stuartii, PstI restriction endonu¬ 
clease of, 502-504 

PrP proteins, in prion diseases, 877-878 
Pseudobactin, 610-611 
Pseudomonas 
antibiotics of, 612-614 
in ethylbenzoate degradation, 558-562 
exotoxin of, 410 

fluorescent, siderophores of, 610-611 
in xenobiotic degradation, 552 
Pseudomonas aeruginosa
alginate lyase of, 391-392 
outer membrane protein F of, 211 
Pseudomonas alcaligenes, lipase of, 506 
Pseudomonas asplenii, for phytoremediation, 
642 

Pseudomonas diminuta 

in antibiotic production, 533 
in organophosphate degradation, 564-
565 

Pseudomonas fluorescens 
antifungal metabolite of, 612-613 
in Bacillus thuringiensis toxin production, 
669, 675-676 
as biosensor, 343-344 
in dinitrotoluene degradation, 643-644 
in phytopathogen control, 618 
Pseudomonas maltophilia, dicamba
O-demethylase of, 787 
Pseudomonas oleovorans, in polyhydroxyal- 
kanoate production, 545 
Pseudomonas pseudoalcaligenes, in trichloro¬ 
ethylene degradation, 564 
Pseudomonas putida 

in hydrocarbon degradation, 558-559 
in plant growth promotion, 601 
siderophores of, 611 

in trichloroethylene degradation, 562-564 
Pseudomonas syringae, ice nucleation pro¬ 
teins of, 614, 616, 900-902 
Pseudorabies virus, mouse model for, 865- 
866 

Psoriasis, antisense oligonucleotides for, 432

PstI restriction endonuclease, 59, 62, 502- 
504 

Pteridine, in folate synthesis, 813 
pUC19 plasmid, cloning vectors for, 64-65
Purification, of proteins, 717-719, 748-750 
Pusztai, Arpad, transgenic potato studies 
of, 927 

pWW0 plasmid, 559-562
Pyoluteorin, for phytopathogen control, 612 
Pyrosequencing, 125-128 
Pyrrolnitrin, for phytopathogen control, 612 
Pyruvate, as biodegradation product, 552 
Pyruvate carboxylase 
in fermentation, 700 
in mammalian cell expression, 280-281 
Pythium ultimum 
antibiotics for, 612-613 
control of, 614 

Q 

Quaternary ammonium compounds, for 
osmoprotection, 794 

Quenchers, in molecular beacons, 352-353 
Quiescent cell system, for fermentation, 
696-697 

R 

R (resistance) plasmids, 58 
Rabbit, transgenic, milk from, 875 


Rabies virus, vector vaccine for, 488 
RAD54 gene, in plant genetic engineering, 
747 

Radiation treatment, in transgenic poultry 
production, 886 

Radioactive isotopes, for DNA hybridiza¬ 
tion, 350 

Radioactive wastes, microbial degradation 
of, 565-567 

Ralstonia eutropha, in polyhydroxyalkanoate 
production, 544-545
Random amplified polymorphic DNA 
(RAPD), 355-357 
Random mutagenesis, 298-303 
RAPD (random amplified polymorphic 
DNA), 355-357 
Rapeseed, see Canola 

Raphanus sativus, in phytopathogenic fusion 
protein production, 790-791 
Rcd protein, in quiescent cell systems, 696-
697 

Recognition sites, of restriction endonu¬ 
cleases, 50-51 

Recombinant DNA Molecular Program 
Advisory Committee, 898-901 
Recombinant DNA technology, 47-97 
cloning vectors in, see Cloning vector(s) 
definition of, 47 
enzymes for, 56 

for eukaryotic protein cloning, 80-86, see 
also Heterologous protein produc¬ 
tion, in eukaryotic cells 
format of, 47 

genomic library creation and screening 
in, 68-80 
history of, 5-6 

for large-scale production, see 
Bioreactor(s); Commercial 
product(s); Large-scale production 
oligonucleotides for, 104-105
products of, purification of, 207-208 
for prokaryotic protein transformation, 
92-94 

regulation of, 898-900 
restriction endonucleases in, see 
Restriction endonucleases 
for SAGE method, 160-163 
summary, 95 

Recombinase, in marker removal, 752- 
753 

Recombination 

in attenuated-vaccine production, 482 
in DNA integration into chromosomes, 
223-224 

double-crossover, 263 
homologous, in transgenic mouse pro¬ 
duction, 851-855 
nonhomologous random, 304 
site-specific, in baculovirus-insect cell 
expression system, 265-267 
in vector vaccine production, 486-487 
Recombinational cloning (Gateway technol¬ 
ogy), 178-181 


Red blood cell antigens, glycosidase altera¬ 
tions in, 395 

Refugia, for toxin resistance prevention, 672 
Regulations, 897-922 

for genetically modified food and food 
ingredients, 903-911 
crops, 907-910 
livestock, 910-911 

produced by microorganisms, 903-907 
history of, 898-900 
for patents, 912-919 

Regulatory proteins, in gene expression, 196

Rejection, of transplanted organs, 876 
Release factor, in translation, 29 
Remediation, for pollution, see 

Bioremediation and biomass utiliza¬ 
tion; Phytoremediation 
Renaturation, in PCR, 110 
Rennet, 903-904 

Reovirus infections, ribozymes for, 435 
Replica plating, 93-94 
Replication, DNA, 18-20, 22 
Reporter genes 
in chloroplasts, 740 
in transformed plant cells, 741-743 
Repressor proteins, 284 
deactivation of, 198-199 
in transcription, 33-37 
Research, patenting and, 918-919 
Resistance 

antibiotic, see Antibiotic resistance 
in insects, to Bacillus thuringiensis toxin, 
671-674 

in livestock animals, to diseases, 876- 
879 
in plants 
bacteria, 787-792 
drought, 793-796 
fungi, 787-792 
herbicide, 782-787, 932-935 
insect, 759-773, 932, 934-935
oxidative stress, 792-793 
salt, 793-796 

system acquired, 788-789 
virus, 773-782 

Response elements, in transcription, 38-40
Restriction endonucleases, 49-57 
commercial production of, 501-504 
cutting sites in, 50-51 
for genomic library creation, 68-70 
maps of, 52-56 
naming of, 50 

for random insertion/deletion mutagene¬ 
sis, 301-303 

recognition sites of, 50-51 
type I, 49 
type II, 49
type III, 49 
type IIS, 51 
type IV, 49-51 
Restriction enzymes 
for DNA shuffling, 303-304 
modifying specificity of, 318-320
in random mutagenesis, 300-303 
Retinal-pigment epithelium-specific pro¬ 
tein, in gene therapy, 447, 449 
Retinitis pigmentosa, mouse model for, 855, 
857 

Retrotransposons, 248 
Retrovirus(es) 

antisense oligonucleotides, protection 
against, 433-434

as gene therapy vectors, 447-448 
packaging mutant of, 419 
for transgenic mouse production, 848- 
850 

for transgenic poultry production, 885- 
886 

Reverse tetracycline-controlled transactiva¬ 
tor protein, in transgenic mouse pro¬ 
duction, 867 
Reverse transcriptase 

for double-stranded DNA formation, 81 
in PCR, 113 
in SAGE method, 160 
Reverse transcription-polymerase chain 
reaction, 359 

Reverse-phase protein microarray analysis, 
176 

Reversible chain terminators, in DNA 
sequencing, 128-131 

Rheumatoid arthritis, ribozymes for, 436 

Rhizobitoxin, 640 

Rhizobium 

ethylene and, 640 
hydrogenase of, 631-635 
nodulation of, 635-640 
in plant growth promotion, 600 
Rhizobium etli 

in fermentation, 700 
hemoglobin of, 629 
in nitrogen fixation, 630 
Rhizobium leguminosarum, hydrogenase of, 
633-635 

Rhizobium meliloti 

antifungal enzymes of, 614 
nodulation genes of, 617, 635-640 
Rhizoctonia solani, control of, 614 
Rhizofiltration, 839-841 
Rhizosecretion, for purification, 749 
Rhizosphere, phytoremediation in, 841 
Rhodamine, for DNA hybridization, 351 
Rhodococcus rhodochrous, in antibiotic pro¬ 
duction, 527 

Rhodopsin gene defects, in mouse model, 
855 

Rhodosporidium toruloides, phenylalanine 
ammonia lyase of, 393 
Ribonuclease, see RNase 
Ribosomal RNA (rRNA), 22, 25-26, 739 
Ribosome(s), RNA in, 25-26, 29 
Ribosome-binding site, 212-213 
Ribozyme(s) 
hairpin, 434 
hammerhead, 434 


in targeted genetic engineering, 748 
therapeutic, 434-437
Ribulose bisphosphate carboxylase, 763, 
794-795 

Rice 

chitinase gene in, 789-790 
erect-leaf phenotype of, 836-837 
ferritin-enriched, 813-814 
folate-enriched, 812-813 
genetically modified 

nutritional content of, 925 
public acceptance of, 726 
regulation of, 909 
golden, 811-812, 925-926 
insect resistance in, 765 
phytopathogen resistance in, 789 
salt-tolerant, 794-795 
siderophores in, 834 
vitamin A-enriched, 811-812, 925-926 
Rice blast, fungal, 787 
Ricin, for Bacillus thuringiensis resistance 
prevention, 771-773 
Rickettsia rickettsii, vaccine for, 492 
Ripening, fruit, 796-799 
RISC (RNA-induced silencing complex), in 
transgenic mouse production, 860 
Rituximab, 402 

RNA, see also Messenger RNA (mRNA); 
Transfer RNA (tRNA) 
antisense, 426-434
aptamers of, 437-440 
in chimeric RNA-DNA molecules, 437, 
746 

vs. DNA, 15 
double-stranded 

in Caenorhabditis elegans, 429 
in interference, 440-442
interference of, 440-442
in insect resistance, 768-769 
in transgenic mouse production, 859- 
861 

in viral resistance, 774-779 
PCR monitoring of, 359 
from retroviruses, for transgenic mouse 
production, 848-850 
ribozymes of, 434-437 
sequences, patenting of, 916-917 
short hairpin, 441-442, 860-861 
small interfering, 440-442, 860-861 
structure of, 21-22 
synthesis of, summary, 45 
targeted alterations to, 747-748 
transcription of, see Transcription 
translation of, see Translation 
types of, 22 

very small (micro-RNAs), 782 
RNA polymerase(s), 22-23 
in chloroplast transcription, 739 
in DNA microarray analysis, 156 
Escherichia coli, 33 
in gene expression, 198 
holoenzyme, 196 
in plant gene expression, 743 


in protein-protein interaction, 182-183 
in transcription, 33-35, 38 
RNAi (RNA interference), 440-442
insect toxicity and, 768-769 
in transgenic mouse production, 859-861 
in viral resistance, 774-779
RNA-induced silencing complex, in trans¬ 
genic mouse production, 860 

RNase 

bull semen, 309-310 
in cloning DNA sequences, 84 
human pancreatic, adding disulfide 
bonds to, 309-310 

in transgenic mouse production, 860-861 
RNase 10-23, therapeutic, 436-437
RNase H 

in cloning DNA sequences, 81-82 
of retroviruses, 433-434
RNase III, for virus resistance, 779-780 
Robotic systems, in metagenomics, 150-151 
Rockman, Alexis, The Farm (painting), 11 
Rocky Mountain spotted fever, vaccine for, 
492 

Rodriguez, R., 59 

Rooster combs, hyaluronic acid from, 545- 
546 

Root(s) 

bacteria attached to, 644-645 
colonization of 

by growth-promoting bacteria, 618-619 
by nitrogen-fixing bacteria, see 
Nitrogen fixation 
microbial insecticides for, 668-670 
nodules on, see Nodulation 
phytate activity in, 606-608 
protein secretion from, 749 
Rose, pigmentation manipulation in, 824 
Rotavirus, edible vaccine for, 832 
RPE65 gene, delivery of, 447, 449 
Rpo proteins, in transcription, 33 
rRNA (ribosomal RNA), 22, 25-26, 739 
Rubber, commercial production of, 541-542 

S

SI promoter, 204 
Saccharification, 570, 572-573 
Saccharomyces cerevisiae 

in alcohol production, 572-573, 594 
cellulase genes of, 587-588 
chaperones in, 251-253 
expression systems for, 244-253
drawbacks of, 253 
intracellular protein production in, 
248-250 

practical products from, 244-245 
protein secretion in, 250-253 
vectors for, 245-248 
gene manipulation in, 588 
in glucoamylase production, 572-573 
in hirudin production, 251 
in low-ethanol wine production, 574 
luminescent variation of, 344 
in lycopene production, 519 
Saccharomyces cerevisiae (continued)
promoters, 244-246 
in protein production, 248-250 
protein secretion from, 248-250 
in starch degradation, 572-574 
in superoxide dismutase production, 248 
Saccharomyces diastaticus, in starch degrada¬ 
tion, 573 

Saccharomycopsis fibuligera, in alcohol pro¬ 
duction, 588 

Saccharopolyspora erythreae, in antibiotic pro¬ 
duction, 530-532, 694 

Safety 

of Bacillus thuringiensis toxin, 933-934 
of bovine growth hormone, 905-906 
environmental 

in fermentation process, 693 
marker genes and, 750-755 
of genetically modified crops, 907-910 
of genetically modified food, 923-932 
allergens, 927-930 

gene transfer to humans or intestinal 
microorganisms, 930-931 
labeling issues in, 931-932 
nutritional alterations and, 924-926 
toxins, 927-930 

of genetically modified products 

apprehension about, 897-900, 936-937 
guidelines for, 898-900 
Safflower, genetically modified, regulation 
of, 909 

SAGE (serial analysis of gene expression), 
160-163 

SAL (salicylate-degrading) plasmid, 558 
SalI restriction endonuclease, 59
Salicylate-degrading plasmid, 558 
Salicylic acid, in phytopathogen resistance, 
788-789 
Salmon 

astaxanthin pigment for, 824-825 
transgenic, 888 
Salmonella 

in antigen delivery, 495-496
attenuated vaccine for, 482-484
PCR monitoring of, 358 
Salmonella enterica serovar Typhi, vaccine 
for, 497 

Salt, plants resistant to, 793-796 
Sandwich immunoassay, in protein 
microarray analysis, 174 
Sanger, Fred, dideoxynucleotide procedure 
of, 118-124, 133 

SARS (severe acute respiratory syndrome), 
vaccine for, 271, 466-467
Sau3AI restriction endonuclease, 52 
Scaffoldin, in cellulosome, 589 
scFv fragment, 409, 696-697 
Sclerotium rolfsii, control of, 614 
Scorpion toxins, in baculovirus production, 
680-681 

Scours, vaccine for, 461 
Scrapie, prevention of, 877-878 
Sec proteins, in protein secretion regulation,
41-43


Secondary antibodies, 339 
Secretion pathways, for proteins, 40-44 
Seed(s), see also Corn; Rice; Wheat 
amino acid modification in, 803-805 
bacterial treatment of, 603 
oleosins in, 748-749 

storage proteins of, modification of, 803- 
805 

Selectable marker genes 
in mammalian expression, 278 
removal of, 227-228, 750-755 
in Saccharomyces cerevisiae systems, 246- 
248 

in transformed plant cells, 741-743 
two-gene system for, 744 
Selection, of plasmid cloning vectors, 60-63 
SELEX (systematic evolution of ligands by 
exponential enrichment), 438-439 
Senescence 
drought-induced, 796 
of fruit, regulation of, 796-799 
Sense suppression, in flower pigmentation 
manipulation, 822 
Sensitivity, of tests, 333 
Sequence degeneracy, 298-299 
Serial analysis of gene expression (SAGE), 
160-163 

Serine acetyltransferase, in L-cysteine pro¬ 
duction, 517 

Serine, cysteine production from, 517 
Serratia marcescens 

antifungal enzymes of, 614 
in Bacillus thuringiensis toxin action, 676 
Severe acute respiratory syndrome (SARS), 
vaccine for, 271, 466-467
Severe combined immunodeficiency disor¬ 
der, gene therapy for, 446 
SfiI recognition sites, 67-68
Shear stress, in bioreactors, 704 
Sheep 

cloning of, 871-873 
scrapie in, 877-878 
transgenic, milk from, 875 
Shiga toxin, edible vaccine for, 832 
Shigella flexneri, in DNA vaccine produc¬ 
tion, 477-478 
Shikimate pathway, 784 
Shine-Dalgarno sequence, 28, 33 
Shotgun cloning strategy, for DNA 
sequencing, 133-136 
Shotgun proteomics, 169 
Shuffling 

of complementarity-determining regions, 
416-417 

DNA, 303-304, 325-326, 382 
Shuttle vectors, 67-68, 242-244 
Sialylation, in Pichia pastoris system, 254- 
255 

Sialyltransferase, in baculovirus-insect cell 
expression system, 267, 269-270 
Sickle-cell disease 
globin alleles in, 349 
molecular diagnosis of, 367-368 


Siderophores, 603 
action of, 610-612, 833-834 
in bioremediation, 646-647 
genes of, 611-612 
in pathogen control, 608-612 
plant vs. bacterial, 610 
structure of, 609-610, 834 
synthesis of, 609 

SIGEX (substrate-induced gene expression) 
screening, 151-153 
Sigma factor(s) 

in Bacillus thuringiensis sporulation, 661- 
662 

Escherichia coli, 33 
RNA polymerase, 200-203 
σD, 202-203
σS, 203

Signal peptide (signal sequence) 
in protein secretion, 229-230 
Saccharomyces cerevisiae, 246, 250-251 
Signal recognition particles, in protein 
secretion, 44 

Silage fermentation, 576-577 
Simian immunodeficiency virus 
in mammalian cell expression, 281 
vaccine for, 479 

Simian virus 40, as vector, 272-273 
Sindbis virus, vector vaccine for, 488 
Single accessory pathway, 43 
Single-chain antibodies, for virus resistance, 
781-782 

Single-chain Fv fragment, 409 
in fermentation, 696-697 
for virus resistance, 781-782 
Single-nucleotide polymorphism analysis, 
for ancestry determination, 361-362 
Sinorhizobium meliloti 
nif genes of, 627 
nodulation genes of, 635-640 
siRNAs (small interfering RNAs), 440-442, 
451-453, 860-861

SmaI restriction endonuclease, 52, 105
Small biological molecules, commercial 
production of, 506-521 
Small interfering RNAs (siRNAs), 440-442,
451-453, 860-861
Smallpox, vaccine for, 459, 486 
SMART strategy, for cDNA amplification, 
113 

Snowdrop plant, insect toxicity of, 767- 
768 

Societal issues 

economic, see Economic issues 
impact of genetically modified organisms 
in environment, 932-935 
regulations in, see Regulations 
safety of genetically modified foods, 923- 
932 

summary, 937-938 

Sodium/hydrogen antiport protein, for 
osmoprotection, 795 
Solubilization, of proteins, 718-719 
Solvents

disposal of, see Bioremediation and bio¬ 
mass utilization; Phytoremediation 
for microbial cell disruption, 714-715
Somatotropin, see Growth hormone 
Sonication 

for DNA fragmentation, 133 
for microbial cell disruption, 715 
Southern blotting, of forensic samples, 356 
Southern corn rootworm, toxins for, 768
Soybean 

ferritin gene from, 813-814 
genetically modified 
export of, 936-937 
intestinal fate of, 930 
nutritional content of, 925 
regulation of, 907 
hydrogenase of, 631 
lysine-enriched, 805 
nitrogen fixation in, 619-620, 631 
oil from, modification of, 805-808 
phytic acid-enriched, 815 
vitamin E-enriched, 810-811 
Spacer regions 
in DNA synthesis, 100, 103
in gene expression, 200-201 
Sparging, in bioreactors, 701-705 
Spatial-refuge strategy, for Bacillus thuringi- 
ensis resistance prevention, 772 
Specific growth rate 
in fermentation, 691-692 
of microbes, 688-689 

Specific (cognate) modification enzymes, in 
restriction nuclease production, 502 
Specificity, of tests, 333 
Spectinomycin resistance gene 
in chloroplasts, 740-741 
in marker gene removal, 754-755
Spike protein, SARS virus, 466-467 
Spleen, monoclonal antibody preparation 
from, 338-339 
Splicing 

of mRNA, aberrant, 432-433
in transcription, 23-24 
Spodoptera frugiperda, baculovirus in, 263, 
267 

Spongiform encephalopathy, prevention of, 
878 

Spruce sawfly, baculovirus effects on, 678 

SPT15 gene, of yeast, 579-580 

Squash 

genetically modified, regulation of, 907 
viral coat proteins in, 777-778 
Stabilizing and antirepressor (STAR) ele¬ 
ment, 284 

Stable cell lines, 277-278 
Staphylococcus aureus 
methicillin-resistant, antibiotics for, 535 
resistance to, in transgenic cattle, 878-879 
subunit vaccine for, 467-468 
Staphylococcus stimulans, for mastitis pre¬ 
vention, 879 

Staphyloxanthin, production of, 535 


STAR (stabilizing and antirepressor) ele¬ 
ment, 284 

StarLink corn, 928-929 
Starch 

in alcohol production, 570-575 
vs. cellulose, 583 
chemical composition of, 569 
in fructose production, 570-576, 819-820 
gelatinization of, 570 
increasing content of, 820-821 
industrial applications of, modification 
for, 818-821 

in isopropanol production, 577 
microbial degradation of, 569-580 
modification of, 818-821 
in xanthan gum production, 535-538 
Starch synthase, in starch synthesis, 818- 
819 

Starch synthase promoter, for vegetable dis¬ 
coloration control, 816 
Starch-binding domain, 573-574 
Starch-branching enzyme, 818-819 
Start codons, in translation, 214 
Stationary phase, of microbial growth, 687- 
689 

Stearic acid, in plants, 805-808 
Stem cells, embryonic 
for transgenic mouse production, 848- 
850 

for transgenic poultry production, 886 
Sterilization, of bioreactors, 702-704 
Sticky ends, in restriction endonucleases, 
49-50, 56

Stirred-tank bioreactors, 701-705 
Stop codons, 30 
Streptavidin

for DNA hybridization, 350-352 
for immunoquantitative real-time PCR, 
361 

overexpression of, 235 
Streptococcus, group C, in hyaluronic acid 
production, 545-547 

Streptococcus equisimilis, in hyaluronic acid 
production, 547 

Streptococcus mutans, vaccine for, 479-480 
Streptococcus sobrinus, vaccine for, 479-480
Streptokinase, decreasing sensitivity of, 
317-318 
Streptomyces 

antibiotic production in, 522-529, 531, 
533-534 

gene expression in, modulation of, 526- 
527 

Streptomyces antibioticus, in melanin produc¬ 
tion, 538-539 

Streptomyces avermitilis, in antibiotic pro¬ 
duction, 527 

Streptomyces clavuligerus, in antibiotic pro¬ 
duction, 534 

Streptomyces coelicolor, in antibiotic produc¬ 
tion, 524, 527-529 

Streptomyces griseus, in antibiotic produc¬ 
tion, 527 


Streptomyces violaceoruber, in antibiotic pro¬ 
duction, 528-529 

Streptomycin resistance gene, in chloro¬ 
plasts, 740-741 
Stress, plant, 604-605 
from bacteria, 787-793 
from drought, 793-796 
from fungi, 787-793 
from herbicides, 782-787 
from insects, 759-773 
from salt, 793-796 
summary, 799-800 
from viruses, 773-782 
Stress ethylene, 604-605, 617-618 
Structural genes, 23, 33, 37-38 
Subproteomes, 166 

Substrate-induced gene expression (SIGEX) 
screening, 151-153 
Subtilisins 

modifying metal cofactor requirements 
of, 316 

modifying multiple properties of, 215 
Subunit vaccine(s), 463-469
advantages and disadvantages of, 463 
cholera, 466 

foot-and-mouth disease virus, 464-466
herpes simplex virus, 463-464 
human papillomavirus, 468-469
influenza, 270-271 
severe acute respiratory syndrome 
(SARS), 466-467 
Staphylococcus aureus, 467-468 
Succinate/succinic acid 

as biodegradation product, 552 
commercial production of, 519-521 
Sucrose, in xanthan gum production, 535- 
538 

Sugar(s), see also specific sugars, e.g., Glucose 
catabolism of, 35-36 
in DNA structure, 14-15 
in RNA structure, 15, 21 
Sugar beet 

fructan genes in, 818 
genetically modified, regulation of, 907 
phytopathogen resistance in, 789 
Sugarcane, genetically modified, regulation 
of, 909 

Sugarcane borer, Bacillus thuringiensis toxin
effects on, 669-670, 674-675 
Suicide proteins, 66 

Sulfhydryl residues, free, reducing number 
of, 311-312 

Sulfonylureas, plant resistance to, 784 
Sunflower 

albumin from, in lupine, 803-804 
oil from 

modification of, 805-808 
vitamin E in, 809 

"Superbug," for hydrocarbon degradation, 
557-558, 914 
Superoxide anion 
action of, 248-250 
in plant stress, 792-793 
Superoxide dismutase
production of, 248-250 
for superoxide anion destruction, 792- 
793 

SuperSAGE protocol, 161 

Surface display, of fusion proteins, 210-212 

Surface proteins, Staphylococcus aureus, 468

Sweet pepper, genetically modified, regula¬ 
tion of, 907 

Sweetness, of fruits and vegetables, 
improvement of, 817-818 
Swine, see Pig 

Synechococcus, in Bacillus thuringiensis toxin 
production, 667 
Synechocystis 

in Bacillus thuringiensis toxin production, 
667 

in vitamin E production, 809-811 
Syneresis, 819 
Synthesizers, DNA, 98-99 
System acquired resistance, 788-789 
Systematic evolution of ligands by expo¬ 
nential enrichment (SELEX), 438- 
439 

T 

tac promoter, 196-198, 204, 213, 214 
Tandem affinity purification tag procedure, 
187-188 

TaqMan protocol, 375 
Target genes 

in DNA integration into chromosome, 
224-227 

manipulation of, 218 
in random mutagenesis, 299 
unwanted amino acid removal from, 
209-210 

Target sequences, in DNA microarray anal¬ 
ysis, 155-158 

Target site, in ELISA, 335-336 
TATA sequence/box, in transcription, 38 
TATA-binding protein, 38 
Telomerase, 20 
Temperature 
in bioreactors, 702 
for fermentation, 693 
for hydrocarbon degradation, 558-559 
for inclusion body control, 219-220 
tolerance of, 291 
Templates 

in cyclic array sequencing, 137 
in PCR, 111-113 

Terminal deoxynucleotidyl transferase, in 
cloning DNA sequences, 84 
Terminal primers, in PCR, 303 
Termination, in translation, 28-29 
Tetanus toxoid, Fab fragment of, 710-711
Tetanus, vector vaccine for, 496 
"Tet-off" and "Tet-on" systems, in trans¬ 
genic mouse production, 867 
Tetracycline regulatory system, in transgen¬ 
ic mice, 866-870 


Tetracycline repressor/transactivator pro¬ 
tein, in transgenic mouse produc¬ 
tion, 867 

Tetracycline resistance gene, 65-66 

in baculovirus-insect cell expression sys¬ 
tem, 267 

in cholera vaccine, 481-482
in directed mutagenesis, 295 
in ethylbenzoate degradation, 561-562 
Tetracycline-controlled transactivator, in 
transgenic mouse production, 867 
Tetrahydrofolate, plants enriched with, 
812-813 

β-Thalassemia, antisense oligonucleotides
for, 433 

Thaumatin-like proteins, 788 
Therapeutic agent(s), 379-425, see also specific
agents

alginate lyase, 390-392 
antibody genes, 443-444
antisense RNA, 426-434
α1-antitrypsin, 393-394
aptamers, 437-440 
chimeric RNA-DNA molecules, 437 
cyanovirin N, 398-399 
delivery of, 444-456
DNase I, 389-390 
enzymes, 389-395 
examples of approved proteins, 379 
glycosidases, 394-395 
growth hormone, 383-388 
interfering RNAs, 440-442 
interferons, 380-383 
interleukin-10, 396-397 
lactic acid bacteria as delivery vehicle for, 
395-399 
leptin, 397-398 

monoclonal antibodies as, 399-403
nucleic acids as, 426-458
pharmaceuticals, 380-388 
phenylalanine ammonia lyase, 392-393 
plants producing, 825-829 
recombinant antibodies as, 403-422
ribozymes, 434-437
summary, 422 

tumor necrosis factor alpha, 388 
Thermostability, of proteins, improvement 
of, 306, 308-311 

Thermus thermophilus, in fructose produc¬ 
tion, 576 

Thioredoxin, 217-218 
Threonine 

in isoleucine production, 518-519 
replacement of, in tyrosyl-tRNA syn¬ 
thetase engineering, 312-314 
Threshold cycle, of PCR, 358 
Thrombin, for fusion protein cleavage, 210

Thymidine, for monoclonal antibody prep¬ 
aration, 338-339 
Thymidine kinase 

in baculovirus-insect cell expression sys¬ 
tem, 267 


herpes simplex virus, 486-488 
in transgenic mouse production, 853-855 
Thymine (T) 
in DNA structure, 15-17 
in RNA structure, 21 
Thyroid-stimulating hormone, 274 
Ti (tumor-inducing) plasmid 

in chimeric gene introduction, 762 
in flower pigmentation manipulation, 822

infection with, 726-730 
in monellin production, 817-818 
in plant amino acid modification, 805 
in promoter isolation, 743 
vector systems derived from, 730-735 
in viral coat transfer, 776-777 
Tissue plasminogen activator, 218-219 
increasing stability and specificity of, 
321-322 
patent for, 915 

Tn5 transposon, in Bacillus thuringiensis 
toxin production, 668 
Tobacco 

in astaxanthin production, 824-825
chitinase gene in, 789-790 
chloroplast engineering in, 740-741 
crown gall disease of, 739 
Cry toxin genes in, 767 
cytokinin suppression in, 796 
drought-resistant, 796 
field tests for, 900-902 
genetically modified, regulation of, 909- 
910 

glutathione peroxide in, 793 
glyphosate N-acetyltransferase gene in, 
786 

hemoglobin gene in, 838 
lectin genes in, 768 
mercury uptake by, 840-841 
phosphorus uptake in, 607-608 
pokeweed antiviral protein in, 780 
salicylic acid production in, 789 
salt-tolerant, 794, 795 
superoxide dismutase gene in, 792-793 
viral coat proteins in, 776-777 
Tobacco budworm, baculovirus effects on, 
678-679 

Tobacco etch virus protease, for fusion pro¬ 
tein cleavage, 210 
Tobacco hornworm 

Bacillus thuringiensis toxin effects on, 669 
toxins for, 768 

Tocopherol (vitamin E), plants enriched 
with, 808-811 

Tod protein, in trichloroethylene degrada¬ 
tion, 562-564 

TOL (toluene-degrading) plasmid, 558 
Toluene, degradation of, 644-645 
Toluene dioxygenase 
in radioactive-waste degradation, 566- 
567 

in trichloroethylene degradation, 562-564 
Toluene-degrading plasmid, 558 
Tomato

1-aminocyclopropane-carboxylate deami¬ 
nase of, 605, 799 

Bacillus thuringiensis toxin gene in, 761- 
762 

bacterial root colonization in, 619 
Flavr Savr, 797 
genetically modified 

nutritional content of, 925 
regulation of, 907 
in monellin production, 817-818 
plastids of, 741 

polygalacturonase production in, 797 
ripening regulation in, 797, 799 
salt-tolerant, 795 
toxins in, 927 
worms infesting, 761-762 
Tomato bushy stunt virus, 781-782 
Tooth decay, vaccine for, 479-480
Torenia, flower wilting regulation in, 799 
Toxic compounds, in environment 
biosensors for, 343-345 
dealing with, see Bioremediation and bio¬ 
mass utilization; Phytoremediation 
Toxin(s) 

Bacillus thuringiensis, 653-677, 767 
in baculovirus production, 680-681 
in genetically modified foods, 927-930 
Trade, genetically engineered food and, 
936-937 

Trademarks, 912 

Transaldolase, in alcohol production, 592 
Transcription, 21-26 
in bacteria, 33-37 
eukaryotic, 23, 37-40 
genomic studies of, 154 
initiation of, 22-25 
prevention of, 39-40
prokaryotic, 23-24, 26-27 
regulation of, 33-40
vs. replication, 22 

repressor protein effects on, 198-199 
Transcription factor X box protein 1, 282 
Transcription factors, 38, 40, 182-183 
Transcription terminators, 214 
in antibiotic production, 527 
in chloroplast transformation, 739 
Transcriptional terminator, 23 
Transcriptomics, 154 
Transduction, of baculovirus, 275-278 
Transfection, 242 
Transfer RNA (tRNA), 22, 26 
amber suppressor, 304 
charged, 29

mRNA interactions with, 27-29 
overexpression of, 214 
Transfer vectors, baculovirus, 263-264 
Transferred DNA 

in crown gall disease, 727-733 
in plants, 743-744, 752-753 
Transformation 

in directed mutagenesis, 292 
of plasmid cloning vectors, 60-63 
of prokaryotes, 92-94 


Transformation efficiency, 61 
Transformation frequency, 60-62 
Transgenesis, definition of, 846 
Transgenic animals, see Animal(s), trans¬ 
genic; Mice, transgenic 
Transgenic plants, see Plant(s), genetic engi¬ 
neering of 

Transient cell lines, 277-278 
Transient-expression systems, for antibody 
production, 829 

Transketolase, in alcohol production, 592 
Translation, 26-39 
efficiency of, 212-214 
in eukaryotes, 26-28 
eukaryotic protein modification after, 
240-242 

modifications after, 29 
in prokaryotes, 26-29 
vectors for, 212-214 
Translation control sequences, 273-274 
Transplantation 

donor organs for, from transgenic ani¬ 
mals, 875-876 
rejection in, 402-403
Transposon Tn5, in Bacillus thuringiensis 
toxin production, 668 
Transribosylzeatin, in crown gall disease, 
728-730 

Transzeatin, in crown gall disease, 728-730 
trc promoter, 198 

Trehalose, for osmoprotection, 794-795 
Triazines, plant resistance to, 784 
Tricaprylin, 150 

Tricarboxylic acid cycle, acetate removal in, 
699-700 

Trichloroacetic acid, in DNA synthesis, 100 
Trichloroethylene, microbial degradation 
of, 562-564 

Trichoderma, expression systems for, 259 
Trichoderma harzianum 
antifungal enzymes of, 614 
chitinase effects on, 790 
Trichoderma reesei 
glucosidase gene of, 588 
in Pichia pastoris system, 254 
Trinitrotoluene, phytoremediation of, 841 
Triosephosphate isomerase, amino acid 
changes in, 310-311 

Tripartite fusion protein, in fermentation, 
708 

Tripartite mating, 93-94 

tRNA, see Transfer RNA (tRNA) 

trp promoter, 196-198, 218 

trp repressor protein, 197 

Trypanosoma cruzi, detection of, 349-350 

Trypsin, production of, 241 

Trypsinogen, 241 

Tryptophan 

commercial production of, 515-516 
gene expression for, 197, 202 
in indigo production, 512-514 
in insulin B peptide production, 708-709 
recombinant, 904-905 
Tryptophan decarboxylase, in tobacco, 768 


tTA gene, in transgenic mouse production, 
867, 869 

TTGACA sequence, 33 
Tuberculosis, vector vaccine for, 492-494 
Tumor necrosis factor alpha, 388, 436 
Tumor suppressor protein p53, 279 
Tumor-inducing (Ti) plasmid, see Ti (tumor- 
inducing) plasmid 

Tungsten particles, for gene transfer, 736- 
738 

Turnip mosaic virus, 782 
Turnip yellow mosaic virus, 782 
Two-hybrid system, for protein-protein 
interaction, 182-185 

Two-plasmid system, for large-scale production, 202 

"Two-tag" tandem-affinity purification tag 
procedure, 187-188 
2µm plasmid, 244-245, 250 
Type I secretion pathway, 43 
Type II secretion pathway, 43-44 
Type III secretion pathway, 43 
Tyrosinase, in melanin production, 539 
Tyrosine, in melanin production, 538-539 
Tyrosine-tRNA synthetase, 304 
Tyrosyl-tRNA synthetase, engineering of, 
312-314 

U 

Ubiquitous chromatin-opening elements, 
284-285 

Ultrafiltration, for microbial cell harvesting, 
714 

Undecylprodigiosin, production of, 524 
Untranslated regions (UTRs), in mammalian expression, 274 
Upstream region, of DNA, 22-23 
Uracil (U), in RNA structure, 21-22 
Uracil-N-glycosylase, in directed mutagenesis, 295, 298 

Urease, Helicobacter pylori, 496-497 
U.S. Department of Agriculture regulations 
for genetically modified crops, 908 
for release of genetically modified organisms, 901-902 

V 

Vaccine(s), 459-500, see also specific pathogens 

adjuvants for, 476-477 
aerosol, 491 

antigens for, plants producing, 828, 830- 
832 

attenuated, 480-486 
bacterial antigen delivery systems for, 
494-496 

under development, 461-462 
DNA, 460-461, 472-480 
edible, 830-832 
for fish infections, 443-444 
history of, 459-460 
inactivated, 460 

for mass vaccination campaigns, 491 
mucosal, 476, 491, 495-496 
multiprotein complexes for, 270-272 
peptide, 469-472, 477 
production of, 460-463 
for pseudorabies virus, 866 
public fear of, 459-460 
subunit, 463-469 
summary, 497 
types of, 460-461 
vector, 486-497 

Vaccinia virus, for vector vaccines, 479, 
486-491 

L-Valine, commercial production of, 518-519 

Valine-tRNA synthetase, 305 
Variable regions, of antibodies, 401 
Varicella-zoster virus, as vaccine vector, 491 
Vascular endothelial growth factor, pegaptanib binding to, 439-440 
Vector(s) 
bicistronic, 274 
cloning, see Cloning vector(s) 
double-cassette, 274 

for eukaryote protein modification, 242- 
244 

expression, see Expression vectors 
fusion, 206-207 
for gene therapy, 447-451 
high-capacity, 861-863 
for mammalian cell expression, 272-279 
baculovirus, 275-278 
design of, 272-275 
selectable markers for, 278 
Saccharomyces cerevisiae, 245-248 
shuttle, 67-68, 242-244 
for transgenic mouse production, 848- 
850 

translation, 212-214 
two-vector systems, 274 
vaccine, 486-497 
Vector vaccine(s), 486-497 
bacillus Calmette-Guerin, 493 
bacteria as antigen delivery for, 494-497 
cholera, 495 

directed against bacteria, 493-494 
directed against viruses, 486-493 
Helicobacter pylori, 496-497 
hepatitis B virus, 488 
herpes simplex virus, 488 
influenza virus, 488 
rabies virus, 488 

Rocky Mountain spotted fever, 492 
Salmonella as antigen delivery for, 495- 
496 

Sindbis virus, 488 
tetanus, 496 
tuberculosis, 492-494 
typhoid fever, 497 
vaccinia virus for, 486-491 
vesicular stomatitis virus, 488 
Vegetable(s) 

discoloration of, 815-817 

oils from, modification of, 805-808 

sweetness of, 817-818 


Vesicles, proteins stored in, 44 
Vesicular stomatitis virus, vector vaccine 
for, 488 

Vibrio cholerae, see Cholera 
Vibrio fischeri, as biosensor, 343 
Vip proteins, of Bacillus thuringiensis, 767 
vir genes, of Ti plasmid, 727, 733 
Viral capsid protein 1 

foot-and-mouth disease virus, 465, 470- 
471 

human papillomavirus, 469 
Viroids, plant infections with, 780 
Virus(es), see also specific viruses 
in antibody production, in plants, 829 
plants resistant to, 773-782 
coat proteins for, 773-779 
micro-RNAs for, 782 
multiple viruses, 777-779 
pokeweed antiviral protein for, 780- 
781 

RNase III, 779-780 
single-chain antibodies for, 781-782 
RNAi interactions with, 440-442 
Vitamin A, plants enriched with, 811-812, 
925-926 

Vitamin E, plants enriched with, 808-811 
Vitamin(s), in plants, modification of, 808- 
813 

Vitellogenins, of medaka, 889-890 
Vitreoscilla 

in antibiotic production, 533 
hemoglobin of, 220-221, 629, 694, 838 
vp37 gene, of vaccinia virus, 488 

W 

Wastes, dealing with, see Bioremediation 
and biomass utilization; 
Phytoremediation 

Water balance, abnormal, plants resistant 
to, 793-796 

Watermelon mosaic virus, 777-778 
Watson, James, 16-17, 20, 142 
Western corn rootworm 

Bacillus thuringiensis toxin effects on, 676- 
677 

insecticides for, 934 
RNA interference in, 769 
Wet milling, for microbial cell disruption, 
715-716 

Wheat 

erect-leaf phenotype of, 836-837 
virus resistance in, 779-780 
Whey, in xanthan gum production, 537 
Whitefly, 768 
Wilting 

in flooding, 605 
regulation of, 796-799 
Wines, low-ethanol, 574-575 
World Health Organization, Codex 

Alimentarius Commission of, 911 

X 

X chromosome, in ancestry determination, 
361-364 


Xanthan gum, commercial production of, 
535-538 

Xanthomonas campestris, in gum production, 
535-538 

Xanthomonas maltophilia, in fermentation, 
694 

Xbp-1 (X box protein 1), 282 
Xenobiotics, microbial degradation of, 
551-569 

Xenografts, from transgenic animals, 875- 
876 

XenoMouse, 407-408, 863 
Xenotransplantation, from transgenic animals, 875-876 

XhoI restriction endonuclease, 52 
XmaI restriction endonuclease, 52 
XYL (xylene-degrading) plasmid, 558 
Xylan(s), 583 
Xylanase 

adding disulfide bonds to, 308-309 
rhizosecretion of, 749 
Xylene-degrading plasmid, 558 
Xylose 

in alcohol production, 593-594 
conversion to D-xylulose, 575-576 
xylS gene product, in ethylbenzoate degradation, 558-562 

Xylulokinase, in alcohol production, 591 
D-Xylulose, conversion from D-xylose, 575- 
576 

Y 

Y chromosome, in ancestry determination, 

361-364 

Y ions, in protein separation, 169 
Yarrowia lipolytica expression system, 255 
YCF1 gene, for phytoremediation, 839-840 
Yeast(s), see also specific yeasts, e.g., 

Saccharomyces cerevisiae 
cell-disrupting procedures for, 715 
gene engineering of, 578-580 
harvesting of, 711-714 
in plant manipulation for phytoremediation, 839-840 

Yeast artificial chromosomes, 242, 248 
for gene transfer, 738 
for transgenic mouse production, 862 
for XenoMouse, 407-408 
Yeast episomal plasmids (YEps), 245-247, 
250-252 

Yeast integrating plasmids (YIps), 245, 252 
YebF protein, 231-232 
YEps (yeast episomal plasmids), 245-247, 
250-252 

YIps (yeast integrating plasmids), 245, 252 

Z 

Zinc finger proteins, in endonucleases, 320 
Zucchini yellow mosaic virus, 777-778 
Zymomonas mobilis 

in alcohol production, 589-595 
cellulase genes of, 587-588 